Foster Provost
Professor
NEC Faculty Fellow
Paduano Fellow in Business Ethics (Emeritus)
Dept. of Information, Operations and Management Sciences
Leonard N. Stern School of Business
New York University
Affiliations: Department of Computer Science
_______________________
Some things:
Machine Learning for Display Advertising -- Keynote talk discussing privacy and efficacy of targeting on-line ads
Tutorial on Predictive Modeling with Social Networks (ICWSM 2009) [prior version @ KDD 2008]
Privacy-friendly on-line brand advertising @ KDD 2009
Repeated labeling -- Best paper award runner-up @ KDD 2008
______________________
Note to prospective Ph.D. students, postdocs, summer interns, etc.
______________________
Prof. Provost has just retired from being Editor-in-Chief of the journal Machine Learning.
Prof. Provost won the 2009 INFORMS Design Science Award for his work on Social Network-based Marketing Systems. Previously he received IBM Faculty Awards for outstanding research in data mining and machine learning. He was elected as a founding board member of the International Machine Learning Society. He is a member of the editorial boards of the Journal of Machine Learning Research (JMLR) and Data Mining and Knowledge Discovery. In 2001, he co-chaired the program of the premier data mining conference (ACM SIGKDD).
Research: Machine Learning, Data Mining & Knowledge Systems
Once it was only a dream of artificial intelligence researchers that business systems would analyze data and "learn" to improve their performance automatically. Now we interact with such machines every day. Fortune 500 companies as well as startup companies use data mining and machine learning technologies to improve the performance of systems for advertising, customer relationship management, fraud detection, marketing, monitoring, and more.
Current Research Topics
- Mining social network data
- Tagging and repeated labeling, for knowledge discovery and data quality
- Active & costly acquisition of data for modeling
- Behavior profiling
- Targeted on-line advertising
- Mining the business news
Oldies but goodies
Edited (with Ron Kohavi of Blue Martini Software) a special issue of the journal Data Mining and Knowledge Discovery, on eCommerce and Data Mining. Available as a book.
Edited (also with Ron Kohavi) a special issue of the journal Machine Learning on Applications and the Knowledge Discovery Process. Included an editorial essay discussing the contributions of applied research to the science of Machine Learning.
International ACM SIGKDD & the KDD Conference (Knowledge Discovery and Data Mining)
The ACM SIGKDD International Conference on Knowledge Discovery and Data Mining is the premier venue for presenting the latest KDD research and for the interaction of KDD researchers and practitioners.
Professor Provost co-chaired the program for KDD-2001, which was held in San Francisco in August 2001. He was the publicity chair for the 1998 and 1999 International Conferences on Knowledge Discovery and Data Mining.
Some Current Students
Josh Attenberg, NYU Poly
Xiaohan Zhang, NYU Stern
Some Prior Students
Maytal Saar-Tsechansky, Associate Professor, Univ. Texas at Austin
Gary Weiss, Associate Professor, Fordham University (Ph.D. from Rutgers University, Computer Science; Co-advised with Haym Hirsh)
Claudia Perlich, Chief Scientist, Media6degrees. (Formerly at IBM Research)
Shawndra Hill, Assistant Professor, Wharton School, Univ. of Pennsylvania (Co-advised with Chris Volinsky of AT&T Research)
Brian Dalessandro, Director of Data Mining & Statistical Analysis, Media6degrees
Prior Postdocs
Sofus Macskassy, Director Fetch Labs @ Fetch Technologies. Assistant Adjunct Professor, USC
Victor (Shengli) Sheng, Assistant Professor, University of Central Arkansas
Talks & Tutorials
Tutorial on Predictive Modeling with Social Networks (ICWSM 2009)
Tutorial on Predictive Modeling with Social Networks (KDD 2008)
Tutorial on Social Network Mining (AAAI 2008) [ICWSM 2009 Tutorial above is more comprehensive]
ACM EC'07 Tutorial: Modeling Complex Networks for (Electronic) Commerce
ICML-2003 Invited Talk (see On Applied Research... for essays, etc.)
Publications
Working Papers
2011
2010
2009
2008
2007
- Social Network Collaborative Filtering: Preliminary Results. R. Zheng, F. Provost and A. Ghose. To appear in the Proceedings of the Sixth Workshop on eBusiness (WEB2007), December 2007.
- Learning and Inference in Massive Social Networks. S. Hill, F. Provost and C. Volinsky (2007). The 5th International Workshop on Mining and Learning with Graphs, August 2007.
- Handling Missing Features when Applying Classification Models. M. Saar-Tsechansky and F. Provost. Journal of Machine Learning Research 8(July):1625-1657.
- Classification in Networked Data: A toolkit and a univariate case study. S. Macskassy and F. Provost. Journal of Machine Learning Research 8(May):935-983, 2007.
- Decision-centric Active Learning of Binary-Outcome Models. M. Saar-Tsechansky and F. Provost. Information Systems Research 18(1), March 2007, pp. 4-22. (Publisher's version)
- Data acquisition and cost-effective predictive modeling: targeting offers for electronic commerce. F. Provost, P. Melville, and M. Saar-Tsechansky. Proceedings of the Ninth International Conference on Electronic Commerce, August 2007.
- Modeling Complex Networks for (Electronic) Commerce. F. Provost and A. Sundararajan. Tutorial given at the 2007 ACM Conference on Electronic Commerce (EC07).
2006
2005
- An Intelligent Assistant for the Knowledge Discovery Process: An Ontology-based Approach. A. Bernstein, F. Provost and S. Hill. IEEE Transactions on Knowledge and Data Engineering 17(4), pp. 503-518, 2005. (PDF) (Publisher's version)
- ROC Confidence Bands: An Empirical Evaluation. S. Macskassy, F. Provost, and S. Rosset. In Proceedings of the 22nd International Conference on Machine Learning (ICML-2005). [Also appears in the ICML-2005 Workshop on ROC Analysis in Machine Learning (ROCML-2005).]
- Suspicion scoring based on guilt-by-association, collective inference, and focused data access. S. Macskassy and F. Provost. In Proceedings of the 2005 International Conference on Intelligence Analysis (IA '05).
- An Expected Utility Approach to Active Feature-value Acquisition. P. Melville, M. Saar-Tsechansky, F. Provost, and R. Mooney. In Proceedings of the Fifth IEEE International Conference on Data Mining (ICDM-2005), pp. 483-486. Also appeared in Proceedings of the KDD-05 Workshop on Utility-Based Data Mining, Chicago, IL, August 2005.
- Suspicion scoring based on guilt-by-association, collective inference, and focused data access. S. Macskassy and F. Provost. In Annual Conference of the North American Association for Computational Social and Organizational Science (NAACSOS), 2005. [This is a followup paper to the IA paper above, with new results but considerable overlap, and unfortunately with the same title.]
- NetKit-SRL: A Network Learning Toolkit and its use for classification of networked data. S. Macskassy and F. Provost. In Annual Conference of the North American Association for Computational Social and Organizational Science (NAACSOS), 2005.
- Pointwise ROC Confidence Bounds: An Empirical Evaluation. S. Macskassy, F. Provost, and S. Rosset. In the ICML-2005 Workshop on ROC Analysis in Machine Learning (ROCML-2005).
- Aggregation for Predictive Modeling with Relational Data. C. Perlich and F. Provost. Encyclopedia of Data Warehousing and Mining, 2005
- Toward a Justification of Meta-learning: Is the No Free Lunch Theorem a Show-stopper? C. Giraud-Carrier and F. Provost. In the ICML-2005 Workshop on Meta-Learning.
2004
- Active Sampling for Class Probability Estimation and Ranking. M. Saar-Tsechansky and F. Provost. Machine Learning 54:2 2004, 153-178. (Publisher's version)
- Active Feature-Value Acquisition for Classifier Induction. P. Melville, M. Saar-Tsechansky, F. Provost, and R. Mooney. In Proceedings of the Fourth IEEE International Conference on Data Mining (ICDM-2004).
- Confidence Bands for ROC Curves: Methods and an Empirical Study. S. Macskassy and F. Provost. In Proceedings of the First Workshop on ROC Analysis in AI. August 2004.
- Knowledge Discovery Using Concept-Class Taxonomies. V. Kolluri, F. Provost, B. Buchanan, and D. Metzler. In AI 2004: Advances in Artificial Intelligence: 17th Australian Joint Conference on Artificial Intelligence. Lecture Notes in Computer Science, Springer-Verlag Heidelberg.
- The Gift of Gab: Evidence TelE-Commerce Firms can Profit from Viral Marketing. S. Hill, F. Provost and C. Volinsky. First Interdisciplinary Symposium between Information Systems, Statistics and Related Fields. Decision and Information Technologies Department, Robert H. Smith School of Business, Univ. of Maryland. May 2005
2003
- The Myth of the Double-blind Review? Author Identification using only Citations. S. Hill and F. Provost. In SIGKDD Explorations 5(2), 2003, 179-184.
- Predicting citation rates for physics papers: Constructing features for an ordered probit model. C. Perlich, F. Provost, and S. Macskassy. In SIGKDD Explorations 5(2), 2003, 154-155.
- Aggregation-based Feature Invention and Relational Concept Classes. C. Perlich and F. Provost. In Proceedings of the Ninth SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2003).
- Learning when Training Data are Costly: The Effect of Class Distribution on Tree Induction. G. Weiss and F. Provost. Journal of Artificial Intelligence Research 19 (2003) 315-354.
Prior versions:
- Tree Induction vs. Logistic Regression: A Learning-curve Analysis. C. Perlich, F. Provost, and J. Simonoff. Journal of Machine Learning Research 4 (2003) 211-255.
- Preliminary version: CeDER Working Paper #IS-01-02, Stern School of Business, New York University, NY, NY 10012. Fall 2001. (preliminary version: PDF, PS)
- Tree Induction for Probability-based Rankings. F. Provost and P. Domingos. Machine Learning 52:3. (PDF) (Publisher's version)
- Invited comment on Bolton and Hand’s “Statistical Fraud Detection.” F. Provost. Statistical Science.
- The Relational Vector-space Model and Industry Classification. A. Bernstein, S. Clearwater, and F. Provost. Proceedings of the IJCAI-2003 Workshop on Learning Statistical Models from Relational Data.
- A Simple Relational Classifier. S. Macskassy and F. Provost. Proceedings of the KDD-2003 Workshop on Multirelational Data Mining.
- Relational Learning Problems and Simple Models. F. Provost, C. Perlich, and S. Macskassy. Proceedings of the IJCAI-2003 Workshop on Learning Statistical Models from Relational Data.
- Aggregation and Concept Complexity in Relational Learning. C. Perlich and F. Provost. Proceedings of the IJCAI-2003 Workshop on Learning Statistical Models from Relational Data.
2002
- Perlich, C. and F. Provost. "A Modular Approach to Relational Data Mining." Americas Conference on Information Systems (AMCIS) 2002.
- Bernstein, A., S. Clearwater, S. Hill, C. Perlich, and F. Provost. “Discovering Knowledge from Relational Data Extracted from Business News.” In Proceedings of the KDD-2002 Workshop on Multi-Relational Data Mining, 2002.
- Provost, F. and V. Kolluri, "Scalability." In W. Kloesgen and J. Zytkow (eds.), Handbook of Knowledge Discovery and Data Mining.
- Danyluk, A. and F. Provost, "Telecommunications Network Diagnosis." In W. Kloesgen and J. Zytkow (eds.), Handbook of Knowledge Discovery and Data Mining. (PDF)
- Fawcett, T. and F. Provost, "Data Mining for Fraud Detection." In W. Kloesgen and J. Zytkow (eds.), Handbook of Knowledge Discovery and Data Mining.
2001
- Saar-Tsechansky, M. and F. Provost, "Active Learning for Class Probability Estimation and Ranking." In Proceedings of the Seventeenth International Joint Conference on Artificial Intelligence (IJCAI-01). [See also extended version (to appear in Machine Learning, see above)]
- Macskassy, S., H. Hirsh, F. Provost, R. Sankaranarayanan, V. Dhar. “Intelligent Information Triage.” In Proceedings of SIGIR-2001.
- Bernstein, A. and F. Provost. "An Intelligent Assistant for the Knowledge Discovery Process." In Proceedings of IJCAI-01 Workshop on Wrappers for Performance Enhancement in KDD. (CeDER Working Paper #IS-01-01, Stern School of Business, New York University, January 2001.)
- Provost, F. and T. Fawcett, "Robust Classification for Imprecise Environments." Machine Learning 42, 203-231, 2001. (PDF) (Publisher's version)
- Kohavi, R. and F. Provost (eds.), Special Issue on Applications of Data Mining to Electronic Commerce, Data Mining and Knowledge Discovery 5, (1/2) 2001.
- Kohavi, R. and F. Provost, Applications of Data Mining to Electronic Commerce (introductory article), Data Mining and Knowledge Discovery 5, (1/2) 2001.
2000
- Dhar, V., D. Chou, and F. Provost, "Discovering Interesting Patterns for Investment Decision Making with GLOWER -- A Genetic Learner Overlaid With Entropy Reduction," Data Mining and Knowledge Discovery 4(4) 2000.
- Provost, F. "Learning with Imbalanced Data Sets 101." Invited paper for the AAAI'2000 Workshop on Imbalanced Data Sets.
- Provost, F., D. Jensen and T. Oates, "Progressive Sampling." In H. Liu and H. Motoda (eds.), Instance Selection and Construction, A Data Mining Perspective.
- Provost, F., "Distributed Data Mining: Scaling up and beyond." In H. Kargupta and P. Chan (eds.), Advances in Distributed Data Mining, San Francisco, CA: Morgan Kaufmann.
- Provost, F., and P. Domingos. "Well-trained PETs: Improving Probability Estimation Trees." CeDER Working Paper #IS-00-04, Stern School of Business, New York University, NY, NY 10012 (PDF) [Journal version is much improved.]
1999
- Provost, F. and V. Kolluri, "A Survey of Methods for Scaling Up Inductive Algorithms." Data Mining and Knowledge Discovery 3 (1999). (PDF)
- Provost, F. and A. Danyluk, "Problem Definition, Data Cleaning and Evaluation: A Classifier Learning Case Study." Informatica 23 (1999). (PDF)
- Provost, F., D. Jensen and T. Oates, "Efficient Progressive Sampling." Proceedings of the Fifth International Conference on Knowledge Discovery and Data Mining (KDD-99).
- T. Fawcett and F. Provost, "Activity Monitoring: Noticing Interesting Changes in Behavior." Proceedings of the Fifth International Conference on Knowledge Discovery and Data Mining (KDD-99).
- Danyluk, A., T. Fawcett, and F. Provost, "AI Approaches to Time-series Problems." Workshop report in AI Magazine, 1999.
- Provost, F. and D. Jensen, "Evaluating Machine Learning, Knowledge Discovery, and Data Mining." Tutorial presented at the Sixteenth International Joint Conference on Artificial Intelligence (IJCAI-99) and at the Sixteenth National Conference on Artificial Intelligence (AAAI-99). (abstract | links)
- Provost, F., J. Aronis, and B. Buchanan. "Rule-space search for knowledge-based discovery." CIIO Working Paper #IS 99-012, Stern School of Business, New York University, NY, NY 10012 (PDF)
1998
- Provost, F. and R. Kohavi, "On Applied Research in Machine Learning." Guest editorial in Machine Learning 30 (2/3) 1998. (postscript)
- Kohavi, R. and F. Provost (guest eds.), Special issue on "Applications of Machine Learning and the Knowledge Discovery Process." Machine Learning 30 (2/3) 1998. (table of contents)
- Provost, F. and T. Fawcett, "Robust Classification Systems for Imprecise Environments." In Proceedings of the Fifteenth National Conference on Artificial Intelligence (AAAI-98).
- Provost, F., T. Fawcett, and R. Kohavi, "The Case Against Accuracy Estimation for Comparing Classifiers." In Proceedings of the Fifteenth International Conference on Machine Learning (ICML-98).
- Fawcett, T., I. Haimowitz, F. Provost, and S. Stolfo, "AI Approaches to Fraud Detection and Risk Management." Workshop report in AI Magazine, 1998.
- Provost, F. and D. Jensen, "Evaluating Data Mining and the Knowledge Discovered." Tutorial presented at the Fourth International Conference on Knowledge Discovery and Data Mining (KDD-98).
- Fawcett, T. and F. Provost, "Automatic Design of Fraud Detection Systems." U.S. Patent #5,790,645.
1997
- Fawcett, T. and F. Provost, "Adaptive Fraud Detection." Data Mining and Knowledge Discovery 1 (1997).
- Provost, F. and T. Fawcett, "Analysis and Visualization of Classifier Performance: Comparison under Imprecise Class and Cost Distributions." In Proceedings of the Third International Conference on Knowledge Discovery and Data Mining (KDD-97). Best Paper Award Winner.
- Provost, F. and V. Kolluri, "Scaling Up Inductive Algorithms: An Overview." In Proceedings of the Third International Conference on Knowledge Discovery and Data Mining (KDD-97).
- Aronis, J. and F. Provost, "Increasing the Efficiency of Inductive Learning with Breadth-first Marker Propagation." In Proceedings of the Third International Conference on Knowledge Discovery and Data Mining (KDD-97).
- Aronis, J., V. Kolluri, F. Provost, and B. Buchanan, "The WoRLD: Knowledge Discovery from Multiple Distributed Databases." In Proc. of the Florida Artificial Intelligence Research Symposium (FLAIRS-97).
1996
- Provost, F. and J. Aronis, "Scaling Up Machine Learning with Massive Parallelism." Machine Learning 23 (1996).
- Provost, F. and D. Hennessy, "Scaling Up: Distributed Machine Learning with Cooperation." In Proceedings of the Thirteenth National Conference on Artificial Intelligence (AAAI-96).
- Fawcett, T. and F. Provost, "Combining Data Mining and Machine Learning for Effective User Profiling." In Proceedings of the Second International Conference on Knowledge Discovery and Data Mining (KDD-96).
- Aronis, J., F. Provost, and B. Buchanan, "Exploiting Background Knowledge in Automated Discovery." In Proceedings of the Second International Conference on Knowledge Discovery and Data Mining (KDD-96).
1995
- Provost, F. and B. Buchanan, "Inductive Policy: The Pragmatics of Bias Selection." Machine Learning 20 (1995).
- Krenzelok, E. and F. Provost, "The Ten Most Common Plant Exposures Reported to Poison Information Centers in the United States." Journal of Natural Toxins (1995).
- Provost, F. and A. Danyluk, "Learning from Bad Data." In Proceedings of the ML-95 Workshop on Applying Machine Learning in Practice, 1995.
- Krenzelok, E., F. Provost, T. Jacobsen, J. Aronis, B. Buchanan, "Assessing Patient Referral Patterns to a Health Care Facility in Plant Exposure Patients Using Computer Artificial Intelligence." European Association of Poison Centres and Clinical Toxicologists Scientific Meeting. May 18-20, 1995, Krakow, Poland.
- Krenzelok, E., F. Provost, T. Jacobsen, J. Aronis, B. Buchanan, "Poinsettia (Euphorbia pulcherrima) Exposures Have Good Outcomes...Just As We Thought." European Association of Poison Centres and Clinical Toxicologists Scientific Meeting, 1995.
1994
- Provost, F. and D. Hennessy, "Distributed Machine Learning: Scaling up with Coarse-grained Parallelism." In Proc. of the Second International Conference on Intelligent Systems for Molecular Biology (ISMB-94).
- Aronis, J. and F. Provost, "Efficiently Constructing Relational Features from Background Knowledge for Inductive Machine Learning." In Proceedings of the AAAI-94 Workshop on Knowledge Discovery in Databases (KDD-94).
- Provost, F., "Goal-Directed Inductive Learning: Trading Off Accuracy for Reduced Error Cost." In Proceedings of the AAAI Spring Symposium on Goal-Directed Learning, 1994.
1993
- Provost, F., "Iterative Weakening: Optimal and Near-Optimal Policies for the Selection of Search Bias." In Proceedings of the Eleventh National Conference on Artificial Intelligence (AAAI-93).
- Danyluk, A. and F. Provost, "Small Disjuncts in Action: Learning to Diagnose Errors in the Telephone Network Local Loop." In Proceedings of the Tenth International Conference on Machine Learning (ICML-93).
- Danyluk, A. and F. Provost, "Adaptive Expert Systems: Applying Machine Learning to NYNEX MAX." In Proceedings of the AAAI-93 Workshop: AI in Service and Support--Bridging the Gap between Research and Applications, 1993.
1992
- Provost, F. and R. Melhem, "A Distributed Algorithm for Embedding Trees in Hypercubes with Modifications for Run-Time Fault Tolerance." Journal of Parallel and Distributed Computing 14 (1992).
- Provost, F. and B. Buchanan, "Inductive Policy." In Proceedings of the Tenth National Conference on Artificial Intelligence (AAAI-92).
- Provost, F., "ClimBS: Searching the Bias Space." In Proceedings of the Fourth International IEEE Conference on Tools with Artificial Intelligence (TAI-92).
- Provost, F., "A Baseline Taxonomy of Bias Adjustment Policies." In Proceedings of the ML-92 Workshop on Biases in Learning, 1992.
- Provost, F. and B. Buchanan, "Inductive Strengthening: The effects of a simple heuristic for restricting hypothesis space search." In K.P. Jantke (ed.), Analogical and Inductive Inference (Lecture Notes in Artificial Intelligence 642). Springer-Verlag, 1992.
- Clearwater, S., W. Cleland, F. Provost, E. Stern and Z. Zhang, "A Real-Time Expert System for Experimental High Energy/Nuclear Physics." In D. Perrett-Gallix and W. Wojcik (eds.), New Computing Techniques in Physics Research. Paris: Centre National de la Recherche Scientifique, 1990.
1991
- Provost, F. and R. Melhem, "Embedding Rings in Hypercubes for Run-Time Fault Tolerance." In Proceedings of the Fourth ISMM/IASTED Intl. Conference on Parallel and Distributed Computing and Systems, 1991.
1990
- Clearwater, S., W. Cleland, F. Provost, E. Stern and Z. Zhang, "A Real-Time Expert System for Trigger Logic Monitoring." Nuclear Instruments and Methods in Physics Research A293 (1990).
- Clearwater, S. and F. Provost, "RL4: A Tool for Knowledge-Based Induction." In Proceedings of the Second International IEEE Conference on Tools for Artificial Intelligence (TAI-90).
1989
- Provost, F. and R. Melhem, "Distributed Fault Tolerant Embedding of Trees and Rings in Hypercubes." In I. Koren (ed.), Defect and Fault Tolerance in VLSI Systems, Volume 1. New York, NY: Plenum Press, 1989.
Teaching
Professor Provost teaches Data Mining for Business Intelligence to graduate and undergraduate classes.
In spring 2011 he will teach a new graduate class: Networks, Crowds, and Markets.