DTAI Thesis Topics Dept. Computer Science KU Leuven 2019-2020 De Raedt
DTAI Thesis Topics Dept. Computer Science KU Leuven 2019-2020 http://people.cs.kuleuven.be/~luc.deraedt/dtaithesis18-19.pdf Luc De Raedt
Lab for Declarative Languages and Artificial Intelligence
Machine Learning: 4 ZAP, 1 res. manager, ± 4 post-docs, ± 25 Ph.D. students
Declarative Languages and Systems: 4 ZAP, ± 3 post-docs, ± 12 Ph.D. students
Demoen: retired, still interested in education in informatics
Bruynooghe: retired
AI is hot! Self-driving cars - Eve (the robot scientist) Siri IBM Watson in Jeopardy and “Machine Reading” AlphaGo — (Deep) learning … 3
DTAI's focus on AI Machine Learning & Data Mining how to extract knowledge from data Uncertainty reasoning how to represent and reason about uncertainty Knowledge Representation how to represent and reason about knowledge Learning and Reasoning 4
DTAI's focus on Declarative Languages Declarative = specify the what rather than the how Different types of languages Logic Functional Constraints Probabilistic Explainable / Understandable AI 5
DTAI's methodology involves Fundamental research (theoretical as well as empirical) Systems, Solvers and Software Applications A thesis can focus on one or more aspects, depending on the student's interests This presentation does not go into depth about techniques, but every thesis does 6
This presentation Overview of research, with illustrations of possible thesis topics. List of contact persons for topics Full information: see online A topic of your own should be aligned with the interests of a professor 7
Research Topics (dtai.cs.kuleuven.be/research)
• Probabilistic Programming and Statistical Relational Learning
• Predictive Learning and Clustering
• Automated Data Science
• Graph and Network Mining
• Exploratory Data Mining
• Privacy, Non-discrimination and Ethical aspects
• Knowledge-Base Systems
• Constraints
• Verification of AI and ML
• Static Analysis for Declarative Programming Languages
• Functional Programming
Applications Sports Analytics Health Engineering & Sensors Robotics Games Text and Web Computational Creativity Applications of the Knowledge Base System Paradigm (dtai.cs.kuleuven.be/research ) 9
Research topics
Automated Data Science Contact: Luc De Raedt Can we (partly) automate data science ? Can we automatically derive the right features ? the right representations ? Can we automatically discover what we can learn / predict ? Can we learn constraints ? Example database about students, professors, courses, and marks … The SYNTH project — the democratisation of Data Science the automation of Data Science 11
Automated Data Science Contact: Luc De Raedt, Hendrik Blockeel The Magical Ice Cream Factory 12
Automated Data Science Contact: Luc De Raedt, Anton Dries Inductive Programming Motivation: FlashFill in Excel
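The idea behind FlashFill-style inductive programming can be illustrated with a small sketch: given a few input/output examples, search a space of candidate programs for one that is consistent with all of them. The tiny string language below (the `ATOMS` extractors, the `SEPARATORS`, and the `synthesize` and `run` helpers) is purely hypothetical and chosen for illustration. It is not how FlashFill itself is implemented.

```python
from itertools import product

# Hypothetical atomic extractors of a tiny string language (illustration only).
ATOMS = {
    "first": lambda s: s.split()[0],
    "last": lambda s: s.split()[-1],
    "first_initial": lambda s: s.split()[0][0],
    "last_initial": lambda s: s.split()[-1][0],
}
SEPARATORS = ["", " ", ". ", ", "]


def run(program, text):
    """Execute a program (left atom, separator, right atom) on one input string."""
    left, sep, right = program
    return ATOMS[left](text) + sep + ATOMS[right](text)


def synthesize(examples):
    """Return the first program consistent with every (input, output) example."""
    for program in product(ATOMS, SEPARATORS, ATOMS):
        if all(run(program, inp) == out for inp, out in examples):
            return program
    return None


examples = [("Luc De Raedt", "L. Raedt"), ("Anton Dries", "A. Dries")]
program = synthesize(examples)
print(program)                           # ('first_initial', '. ', 'last')
print(run(program, "Hendrik Blockeel"))  # H. Blockeel
```

Exhaustive enumeration only works because this search space has 64 programs. Realistic inductive programming systems need much richer languages and smarter search.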
Automated Data Science Contact: Luc De Raedt
Learning constraints / program synthesis
Can we recover formulas from a CSV file? What are the formulas here?
T1[:, 6] = SUM(T1[:, 3:5], row)
T2[:, 2] = SUMIF(T1[:, 1]=T2[:, 1], T1[:, 6])
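A minimal sketch of what recovering such formulas could look like: enumerate candidate constraints of a fixed shape and keep those that hold on every row. The tables `T1` and `T2` and all helper names are made up for illustration, and columns are 0-indexed here (the slide's notation is 1-indexed). This is a brute-force check under those assumptions, not the group's actual system.

```python
def is_numeric(table, col):
    """True if every cell in the given column is a number."""
    return all(isinstance(row[col], (int, float)) for row in table)


def find_row_sums(table):
    """Return all (target, start, end) where column `target` equals the row-wise
    sum of the contiguous numeric columns start..end (0-indexed, inclusive)."""
    n_cols = len(table[0])
    numeric = [is_numeric(table, c) for c in range(n_cols)]
    found = []
    for target in range(n_cols):
        if not numeric[target]:
            continue
        for start in range(n_cols):
            for end in range(start + 1, n_cols):
                # Skip ranges that contain the target or a non-numeric column.
                if start <= target <= end or not all(numeric[start:end + 1]):
                    continue
                if all(row[target] == sum(row[start:end + 1]) for row in table):
                    found.append((target, start, end))
    return found


def sumif(table, key_col, value_col, key):
    """Sum `value_col` over the rows whose `key_col` equals `key`."""
    return sum(row[value_col] for row in table if row[key_col] == key)


def holds_sumif(t1, t2, key_col=0, t1_value_col=4, t2_value_col=1):
    """Check that each row of t2 stores the SUMIF of t1 for its key."""
    return all(
        row[t2_value_col] == sumif(t1, key_col, t1_value_col, row[key_col])
        for row in t2
    )


# Hypothetical data: a key, three measurements, and their total.
T1 = [
    ["north", 10, 20, 30, 60],
    ["south", 5, 5, 10, 20],
    ["north", 1, 2, 3, 6],
    ["east", 7, 0, 1, 8],
]
T2 = [["north", 66], ["south", 20], ["east", 8]]

print(find_row_sums(T1))    # [(4, 1, 3)]
print(holds_sumif(T1, T2))  # True
```

With few rows, accidental constraints can hold by chance, which is one reason the learning problem is harder than this sketch suggests.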
Predictive learning and clustering Contact: Hendrik Blockeel, Jesse Davis Basic idea: include, in a single ensemble, trees that predict variables from many other variables [Figure: variables X1 to X6] 15
Predictive learning and clustering Contact: Hendrik Blockeel, Jesse Davis “Standard” machine learning develop new algorithms for machine learning Decision Trees Predictive Clustering Probabilistic Graphical Models evaluation of machine learning (ROC etc.) 16
Probabilistic Programming and Statistical Relational Learning Contact: Luc De Raedt, Hendrik Blockeel, Jesse Davis, Gerda Janssens Key open question in AI: integrate probabilistic reasoning, logical or relational representations, and machine learning (statistical relational learning, probabilistic programming) 17
Probabilistic Programming and Statistical Relational Learning
E.g. ProbLog: a probabilistic Prolog

We first review some basic concepts of logic programming. An atom pred(t1, ..., tn) consists of a predicate pred/n of arity n and terms ti. A term is either a (lowercase) constant, an (uppercase) variable, or a functor func/n applied to n terms. A definite clause is an expression of the form h :- b1, ..., bn, where h and the bi are atoms. It states that h is true whenever all bi are true. If n is 0, we have a fact f, which expresses that f is true. A substitution θ = {X1 = t1, ..., Xn = tn} maps each variable Xi to a term ti. Applying a substitution θ to an atom a yields aθ, in which each occurrence of Xi in a is replaced with ti.

A ProbLog [12, 2] program consists of a set of labeled facts pi :: ci, where pi is a probability value and ci a fact, and a set of definite clauses. Each ground instance of such a fact represents a random variable that is true with probability pi. We use the following ProbLog program as a running example in the paper:

0.05 :: burglary.
0.01 :: earthquake.
0.7 :: hears_alarm(john).
0.6 :: hears_alarm(mary).
alarm :- burglary.
alarm :- earthquake.
calls(Pers) :- alarm, hears_alarm(Pers).

It has the random variables burglary, earthquake, hears_alarm(john) and hears_alarm(mary), and states that there is an alarm whenever there is a burglary or an earthquake. The last clause states that if there is an alarm and a person hears the alarm, that person will call.

Example query: P( hears_alarm(john) | burglary = true) ?

To model univariate discrete distributions (e.g., uniform, Poisson), we also allow for discrete distribution probabilistic facts of the form X ~ (distribution) :: f, where X is a logical variable appearing in atom f and the distribution is given by a probability density function. For example, X ~ uniform(7) :: apples(X) specifies that apples(X) is true with X sampled from the set of integers between 1 and 7 with equal probability. Each grounding of all the variables (except X) in f denotes a random variable. All random variables …

Challenges on inference, learning, implementation, application, ...
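To make the semantics of the running example concrete, the sketch below computes query probabilities by brute-force enumeration of all possible worlds: every truth assignment to the four probabilistic facts, weighted by the product of the fact probabilities. This is plain Python written for illustration and does not use the ProbLog system. The names `FACTS`, `derive`, `worlds` and `prob` are assumptions of this sketch, and real ProbLog inference uses far more efficient techniques than enumeration.

```python
from itertools import product

# Probabilistic facts of the running example.
FACTS = {
    "burglary": 0.05,
    "earthquake": 0.01,
    "hears_alarm(john)": 0.7,
    "hears_alarm(mary)": 0.6,
}


def derive(world):
    """Apply the three definite clauses to one truth assignment of the facts."""
    atoms = dict(world)
    atoms["alarm"] = world["burglary"] or world["earthquake"]
    for person in ("john", "mary"):
        atoms[f"calls({person})"] = atoms["alarm"] and world[f"hears_alarm({person})"]
    return atoms


def worlds():
    """Yield (atoms, weight) for each of the 2^4 possible worlds."""
    names = list(FACTS)
    for values in product([True, False], repeat=len(names)):
        world = dict(zip(names, values))
        weight = 1.0
        for name, value in world.items():
            weight *= FACTS[name] if value else 1.0 - FACTS[name]
        yield derive(world), weight


def prob(query, evidence=None):
    """P(query | evidence), computed by summing the weights of matching worlds."""
    evidence = evidence or {}
    numerator = denominator = 0.0
    for atoms, weight in worlds():
        if all(atoms[name] == value for name, value in evidence.items()):
            denominator += weight
            if atoms[query]:
                numerator += weight
    return numerator / denominator


print(round(prob("alarm"), 5))
print(round(prob("calls(john)"), 5))
print(round(prob("hears_alarm(john)", {"burglary": True}), 5))
```

The first value is 1 - 0.95 × 0.99 = 0.0595, the second is 0.0595 × 0.7 = 0.04165, and the conditional query returns 0.7 because hears_alarm(john) is independent of burglary in this program.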
Probabilistic Programming and Statistical Relational Learning Action and activity learning / Dynamics Travian: A massively multiplayer real-time strategy game Commercial game run by TravianGames GmbH ~3.000.000 players spread over different “worlds” Can we build a model of this world ? Can we use it for playing better ? [Thon et al. ECML 08] 19
Logic + Probability + Neural Networks Contact: Hendrik Blockeel, Luc De Raedt [Figure: image sums "+ = 16", "+ = 3", "+ = ?" and "+ = 4", labelled Data, Query and Answer] DeepProbLog [Manhaeve NeurIPS 2018] 20
Robotics (and Vision)
Contact: Luc De Raedt
Winograd’s SHRDLU: "Put the blue pyramid on the block in the box" (diagram adapted from Winograd, Understanding Natural Language (1972), http://www.wiley.com/college/busin/icmis/oakman/outline/chap11/slides/blocks.htm)
Reality is harder (First-MM): "Bring me the tea pot and the sugar"
● Details are important! For reasoning, planning...
● We cannot ignore position, orientation, shape, physics, etc...
● High-level concepts still useful (objects, properties and relations, background knowledge)
The CLEVR Dataset and Variations
Robotics Contact: Luc De Raedt Learn probabilistic-logic model [Moldovan et al. ICRA 12, 13, 14] [Figure: shelf scenes for the actions grasp, tap and push] 22
Verifying AI & ML systems
Contact: Luc De Raedt, Hendrik Blockeel, Jesse Davis, Bettina Berendt & Wannes Meert
Verification of software has a long tradition (e.g., model checking techniques)
How to verify systems that learn? That use AI?
Our approach: combine principles of probabilistic logics with verification
Topics: inductive synthesis of specifications; Markov Decision Processes (& reinforcement learning); deriving properties of learned systems; …
Socially Aware Data Mining, Graph and Network Mining Contact: Bettina Berendt Help users manage friends and privacy by data mining Focus on privacy and (anti-)discrimination 24
Text and Web Contact: Bettina Berendt, Jesse Davis Extraction of information from the web / social media Taxonomy learning Machine reading / Natural language processing … 25
Knowledge-Base Systems Contact: Marc Denecker, Gerda Janssens IDP Advanced KBS system developed by group FO(.) language rooted in predicate logic and logic programming separation of domain knowledge and problem solving Language extensions to increase expressivity E.g. design patterns for FO(.) (past thesis) Better solvers and more inference methods E.g. a solver for rational numbers (past thesis) 26
Knowledge-Base Systems Contact: Marc Denecker, Gerda Janssens Three themes for students : logical modeling of interesting AI problem + expressing AI knowledge domains logical analysis and implementation of software systems and tasks + software by applying inference on specifications Advanced algorithmics and implementation + extending/optimising the IDP software package. 27
Applications of the Knowledge Base System Paradigm
Logical modeling of AI problems
Analysing medieval manuscripts: DAG coloring & extension
vocabulary Vms { extern vocabulary V IsSource(Manuscript) }
theory Tms : Vms { { ! x : IsSource(x)
Applications of the Knowledge Base System Paradigm Contact: Marc Denecker, Gerda Janssens Software = Knowledge Base + Logical Inference + User Interface E.g., An interactive configuration system for an insurance company AIM : Build cheap, correct, reusable, maintainable software from a logical specification 29
Applications of the Knowledge Base System Paradigm Winning the RuleML Challenge Insurance application Propagation constraints and choices Fill out necessary values 30
Knowledge-Base Systems Contact: Marc Denecker, Gerda Janssens Advanced algorithmics and implementation + extending/ optimising the IDP software package. help us win the next CP or ASP competition + E.g., structuring search space as a hierarchy of search problems + E.g., linear programming techniques in IDP + E.g., improved computation of definitions + E.g., algorithms for revision inference (updating solutions) 31
Constraints Contact: Tom Schrijvers, Marc Denecker, & Luc De Raedt • Hyper heuristics to solve constraint satisfaction and optimization problems — formalisation • Search Heuristics • Role in IDP • Role in Data Mining • Learning of constraints 32
Functional Programming
Contact: Tom Schrijvers
Haskell
★ Explicit Side-Effects: Monads, Transformers, Effect Handlers
★ Advanced Type Systems: Type Classes, Polymorphism, Kinds
★ Domain-Specific Languages: Design, Infrastructure, Applications
★ Much more…
Explanation: You know Functional Programming from the course Declaratieve Talen. In our research we work on all functional languages, and Haskell in particular. Current topics are explicit side-effects such as monads, advanced type system features, and domain-specific languages.
Functional Programming: Widespread Adoption
Haskell Language + GHC Compiler
Early adopters in industry: Finance, Telecom, many others
[Figure: timeline from functional languages to anonymous functions in mainstream languages: λ calculus (Alonzo Church, 1936), Lisp (John McCarthy, 1958), ML (Robin Milner, 1973), Haskell (Haskell Committee, 1987), C# (2007), C++11 (2011), Java 8 and Swift (2014); FP is now mainstream]
Explanation: Many interesting challenges arise from the growing mainstream adoption of Functional Programming. More and more companies are working with functional languages such as Haskell and F#, and mainstream languages such as Java and C# are adopting functional concepts.
Functional Programming 201: The Oracle of Haskell
abs x | x >= 0 = x
      | x < 0 = -x
GHC compiler + your oracle: ✓ exhaustive guards
Explanation: develop an oracle that checks whether the guards in Haskell programs cover all cases
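As a rough illustration of what such an oracle has to decide, the sketch below checks exhaustiveness for a deliberately tiny guard language: every guard compares one integer variable `x` with a constant. The representation of guards as `(operator, constant)` pairs and the function name `uncovered_witness` are assumptions of this sketch, written in Python as a stand-in. It is not GHC's algorithm, and real Haskell guards are arbitrary Boolean expressions, which is what makes the thesis topic hard.

```python
import operator

# Comparison operators allowed in this hypothetical mini guard language.
OPS = {
    "<": operator.lt, "<=": operator.le,
    ">": operator.gt, ">=": operator.ge,
    "==": operator.eq, "/=": operator.ne,
}


def uncovered_witness(guards):
    """Return an integer accepted by no guard, or None if the guards are exhaustive.

    Each guard is a pair (op, constant), read as `x op constant`.
    """
    constants = [c for _, c in guards] or [0]
    # A guard's truth value can only change at one of its constants, so testing
    # c - 1, c and c + 1 for every constant hits every region of the integers.
    candidates = sorted({c + d for c in constants for d in (-1, 0, 1)})
    for x in candidates:
        if not any(OPS[op](x, c) for op, c in guards):
            return x
    return None


abs_guards = [(">=", 0), ("<", 0)]    # the guards of abs on the slide
broken_guards = [(">", 0), ("<", 0)]  # forgets the case x == 0

print(uncovered_witness(abs_guards))     # None
print(uncovered_witness(broken_guards))  # 0
```

Returning a concrete counterexample rather than a yes/no answer mirrors what a programmer would want from a compiler warning.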
Static Analysis for Declarative Programming Languages
Contact: Tom Schrijvers, Gerda Janssens
★ Type Checking
★ Termination Analysis
★ Reasoning about Coroutines
Explanation: You know the declarative language Prolog from the course Declaratieve Talen. In our research we work on the automatic analysis of Prolog programs. Current topics are a type checker that makes Prolog statically typed, determining whether programs terminate, and analysing complex control flow such as coroutines.
Declarative Programming Languages: Automatically Inferring Properties of Interest
powerful, dynamic, flexible
append([],L,L).
append([X|Xs],Ys,[X|Zs]) :- append(Xs,Ys,Zs).
optimisation, correctness, termination
Explanation: Declarative languages such as Prolog are very powerful, dynamic and flexible. The challenge is to automatically derive important properties of Prolog programs, in order to check whether they are correct, whether they always terminate, and how they can be compiled efficiently.
Declarative Programming Languages: Industrial-Strength Static Types for Prolog
Prolog Program + Types → your type checker → bugs
Case Study: Industrial Partner Prosyn, Expert System, 1 MegaLoC Prolog
Explanation: Prolog is an untyped language. As a result, typos easily introduce bugs that are hard to track down. In this thesis you develop a type system for Prolog: the programmer writes type signatures for their predicates, and your type checker uses them to find bugs. You evaluate your type checker on the Prosyn expert …
Application Areas
Industry
Contact: Wannes Meert
• Airplanes collect many flight parameters
• Airplane health & reliability extremely important
• BUT: Ground maintenance checks cost flying time
• Automating diagnostics and predicting when the airplane will need repairs = win-win
• Anomaly Detection
• Learning
Theses with: Boeing, Jetairfly, EuroMillions Basketball League, 3E, Sirris, Thomson-Reuters, Xenit, Pepite, Melexis, Flanders Make, imec, Cern, …
Image source: http://www.b737.org.uk/737ng.htm
Sports Analytics Contact: Jesse Davis Machine Learning for sports Soccer & basketball E-sports 41
Sports Analytics Tasks Strategy detection Performance analysis & prediction Scouting 42
Sports Analytics Thesis Topics Soccer analytics Model flow of a game Quantify team performance Learn aging curves of players Basketball analytics Detect surprising events 43
Health Tasks Continuous monitoring Injury risk profiles 44
Health Thesis topics Performance management and Injury prevention Sensor fusion for surface detection and skill detection in runners Kinect monitoring for qualitative feedback during rehabilitation 45
Anomaly Detection
Contact: Jesse Davis, Hendrik Blockeel, Wannes Meert
Anomalies are behaviors that do not conform to what is expected.
Anomalies typically entail significant costs, e.g., fraudulent credit card transactions, excess usage, etc.
Example: typically, no usage at night, except for sporadic maintenance.
Topics: design new algorithms to detect anomalies; applications, e.g., airplanes, CERN, resources
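A minimal baseline for the night-usage example, assuming a list of made-up hourly readings: flag values whose modified z-score, based on the median and the median absolute deviation (MAD), exceeds a threshold. The data, the function name `mad_anomalies` and the conventional 3.5 threshold are assumptions of this sketch. It stands in for "what is expected" in the simplest possible way and is not one of the group's algorithms.

```python
from statistics import median


def mad_anomalies(values, threshold=3.5):
    """Return the indices of values whose modified z-score exceeds `threshold`."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # More than half of the values equal the median: flag everything else.
        return [i for i, v in enumerate(values) if v != med]
    # 0.6745 rescales the MAD so the score is comparable to a standard z-score.
    return [
        i for i, v in enumerate(values)
        if 0.6745 * abs(v - med) / mad > threshold
    ]


# Hypothetical night-time usage readings; index 5 is a sporadic maintenance run.
night_usage = [1, 0, 2, 1, 0, 50, 1, 2]
print(mad_anomalies(night_usage))  # [5]
```

Note that this baseline flags the sporadic maintenance as anomalous as well. Telling expected rare events apart from genuine problems needs context beyond a single threshold.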
Engineering & Sensors Contact: Wannes Meert, Jesse Davis, Hendrik Blockeel, Luc De Raedt Analysing data from airplanes; Large Hadron Collider maintenance (CERN); Anomaly Detection
Engineering & Sensors The automatic Engineer Contact: Wannes Meert http://dtai.cs.kuleuven.be Example use case: Automatic Engineer Goal: Learn constraints and programs over heterogeneous knowledge sources to assist engineers in proposing new designs, finding similar designs, and verifying designs. Probabilistic programming Measurements Technical drawings Standards Spreadsheets Active learning Constraint programming
AI Challenges Games Contact: Luc De Raedt, Jesse Davis, Anton Dries, Hendrik Blockeel learning to solve science tests formulated in natural language (like SAT, GMAT, GRE, …) Tests as a testbed for intelligent behavior, for “reasoning” Allen AI Institute, Levesque’s Winograd test, IBM Watson … 49
[Figure: collage of probability word problems stated in natural language, e.g. "In a group of 10 people, 60 percent have brown eyes. Two people are selected from the group. What is the probability that neither of them has brown eyes?"]
GOAL: solve the problem directly from text
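One of the word problems in the collage reads, roughly: "In a group of 10 people, 60 percent have brown eyes. Two people are selected from the group. What is the probability that neither of them has brown eyes?" Once such a problem has been formalised by hand, solving it is a short counting exercise, sketched below with a hypothetical helper `prob_none_selected`. The research challenge is the step this sketch skips, namely getting from the text to the formal model automatically.

```python
from fractions import Fraction
from math import comb


def prob_none_selected(group_size, marked, draws):
    """Probability that `draws` people chosen without replacement from a group of
    `group_size` include none of the `marked` people."""
    unmarked = group_size - marked
    return Fraction(comb(unmarked, draws), comb(group_size, draws))


# 60 percent of 10 people have brown eyes, so 6 are "marked" and 4 are not.
answer = prob_none_selected(group_size=10, marked=6, draws=2)
print(answer)  # 2/15
```

The result follows from C(4, 2) / C(10, 2) = 6 / 45 = 2/15.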
Computational Creativity
Contact: Luc De Raedt
Algorithmic perspective on creative behaviour
(Help) generate e.g. humor, music, …
"I like my men like I like my graves: nameless."
"I like my coffee like I like my country: cold."
Thesis Thomas Winters
Luc De Raedt: Artificial intelligence, reasoning about uncertainty, action- and activity learning, machine learning, data mining, constraint programming, probabilistic programming (ProbLog), automated data science, language for mining and learning. Applications in natural language, vision, robotics, automatic programming. Verification of AI and ML. Computational Creativity.
Hendrik Blockeel: Machine learning, data mining, probabilistic logics, declarative languages for data mining. Application domains include bio-informatics, arts, history, compiler development, optimization.
Jesse Davis: Machine learning, data mining for personalized medicine. Artificial intelligence, statistical relational learning, transfer learning, anomaly detection. Applications in healthcare (e.g., clinical practice, physical therapy, medical and biological texts, etc.). Applications to sport (e.g., football and basketball).
Bettina Berendt: Web mining, privacy, social media, user issues.
Wannes Meert: Probabilistic programming and methods. Data Science Applications. Applications in engineering. Collaborations with industry.
Tom Schrijvers: Functional programming, constraint and logic programming, type systems, programming language theory, programming language design and implementation, program analysis.
Marc Denecker: Constraint programming, Knowledge Base Systems, SAT solving, declarative languages (formal modelling languages). Applications in configuration, scheduling, optimization, security, business rule systems, executable formal software specifications, logical workflow languages.
Gerda Janssens: Performant probabilistic ILP data mining systems, integration of logic programming techniques in the knowledge representation language FO(.), program analysis and abstract interpretation, implementations of logic programs, verification of functional equivalence of C programs.
Bart Demoen: Education in informatics in schools.
Check out dtai-web for more details
Questions ? Advisable to contact promotors or daily advisors before selecting a topic Also, attend thesis info market after Easter Holidays