Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence (IJCAI-18)

A Joint Learning Approach to Intelligent Job Interview Assessment

Dazhong Shen¹,², Hengshu Zhu²,*, Chen Zhu², Tong Xu¹,², Chao Ma², Hui Xiong¹,²,³,⁴,*
¹Anhui Province Key Lab of Big Data Analysis and Application, University of S&T of China, ²Baidu Talent Intelligence Center, ³Business Intelligence Lab, Baidu Research, ⁴National Engineering Laboratory of Deep Learning Technology and Application, China.
sdz@mail.ustc.edu.cn, {zhuhengshu, zhuchen02, machao13}@baidu.com, tongxu@ustc.edu.cn, xionghui@gmail.com
*Corresponding Author.

Abstract

The job interview is considered one of the most essential tasks in talent recruitment, as it forms a bridge between candidates and employers in fitting the right person to the right job. While substantial efforts have been made on improving the job interview process, biased or inconsistent interview assessment is inevitable due to the subjective nature of the traditional interview process. To this end, in this paper, we propose a novel approach to intelligent job interview assessment by learning from large-scale real-world interview data. Specifically, we develop a latent variable model named Joint Learning Model on Interview Assessment (JLMIA) to jointly model job description, candidate resume and interview assessment. JLMIA can effectively learn the representative perspectives of different job interview processes from historical successful job interview records. Therefore, a variety of applications in job interviews can be enabled, such as person-job fit and interview question recommendation. Extensive experiments conducted on real-world data clearly validate the effectiveness of JLMIA, which can lead to substantially less bias in job interviews and provide a valuable understanding of job interview assessment.

1 Introduction

As one of the most important functions in human resource management, talent recruitment aims at acquiring the right talents for organizations and has a direct impact on business success. As indicated in an article from Forbes, US corporations spend nearly 72 billion dollars each year on a variety of recruiting services, and the worldwide amount is likely three times bigger [Bersin, 2013]. In particular, the job interview, considered one of the most useful tools and the final testing ground for evaluating potential employees in the hiring process, has attracted more and more attention in human resource management. While substantial efforts have been made on improving the job interview process, the traditional interview process carries a substantial risk of bias due to its subjective nature. This situation can be even more severe, since different interviewers may have different technical backgrounds or different experience levels in assessing personal qualities. This may lead to a biased or incomplete assessment of a job candidate.

Recently, the Artificial Intelligence (AI) trend has made its way into talent recruitment, such as job recommendation [Malinowski et al., 2006; Paparrizos et al., 2011; Zhang et al., 2014], talent mapping [Xu et al., 2016], and market trend analysis [Zhu et al., 2016]. However, fewer efforts have been made on enhancing the quality and experience of job interviews. A critical challenge along this line is how to reveal the latent relationships between job position and candidate, and further form perspectives for effective interview assessment. Intuitively, experienced interviewers can discover the topic-level correlation between a job description and a resume, and then design the interview details to measure the suitability of applicants. For example, a candidate for "Software Engineer" who has a strong academic background might be interviewed with questions not only about "Algorithm" and "Programming", but also "Research". Meanwhile, compared with the technical interview, the vocabulary of the comprehensive interview can be largely different.
To this end, we propose a novel approach to intelligent job interview assessment by learning from large-scale real-world interview data. Specifically, we develop a latent variable model named Joint Learning Model on Interview Assessment (JLMIA) to jointly model job description, candidate resume and interview assessment. JLMIA can effectively learn the representative perspectives of different job interview processes from historical successful job interview records. Also, two categories of interviews, technical and comprehensive interviews, which are hosted by technical and managerial interviewers respectively, can be well differentiated. Furthermore, based on JLMIA, we also provide solutions for two applications, namely person-job fit and interview question recommendation. Extensive experiments conducted on real-world data clearly validate the effectiveness of JLMIA, which can lead to substantially less bias in job interviews and provide a valuable understanding of job interview assessment.
2 Problem Statement

Formally, our data set contains the recruitment documents of $|M|$ unique jobs, i.e., $S = \{S_m = (J_m, A_m)\}_{m=1}^{|M|}$, where $J_m$ is the job description of the $m$-th job and $A_m$ is the set of interview records of this job. Specifically, $A_m = \{(R_{md}, E_{md})\}_{d=1}^{|D_m|}$ contains $|D_m|$ interviews, where $R_{md}$ is the resume of the candidate in the $d$-th interview, and $E_{md}$ is the corresponding interview assessment. Since all of the job descriptions, resumes, and interview assessments are textual data, we use bag-of-words to represent them, e.g., $J_m = \{w^J_{mj}\}_{j=1}^{N^J_m}$, and similarly for $R_{md}$ and $E_{md}$.

A job description $J_m$ contains detailed job requirements, and a resume $R_{md}$ mainly consists of the past experiences of the candidate, which can reflect her abilities. Meanwhile, the evaluation of a candidate in an interview assessment bridges the gap between the job requirements and her abilities. According to the goal of the interview, interview assessments can be further divided into technical and comprehensive interviews.

As we know, during an interview, interviewers tend to ask questions related to the work experiences of candidates. Thus there often exists a strong correlation between interview assessments and resumes. However, a job description is usually more abstract than resumes, and candidates with different backgrounds may be suitable for the same job. Thus we believe that, although there exists a correlation between job descriptions and resumes, the diversity of job descriptions is less than that of resumes. In addition, the focus of an interview obviously differs according to its goal, so it is better to model the differences between technical and comprehensive interviews.

Generally, the main tasks in this paper can be summarized as: Task 1, how to discover the strong correlation between resumes and interview assessments? Task 2, how to model the latent relationships between job descriptions and resumes? Task 3, how to distinguish the differences between different interview categories?
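To make the notation concrete, the following is a minimal sketch (our illustration, not the authors' code) of how the data set $S$ could be organized in Python, with every document stored as a bag-of-words; all class and field names are assumptions for illustration.

```python
# A minimal sketch of the data set S = {S_m = (J_m, A_m)}; names are illustrative.
from collections import Counter
from dataclasses import dataclass, field
from typing import List

@dataclass
class InterviewRecord:
    resume_words: Counter       # R_md as a bag-of-words
    assessment_words: Counter   # E_md as a bag-of-words
    interview_type: str         # "TI" (technical) or "CI" (comprehensive)

@dataclass
class JobRecord:
    job_description_words: Counter                       # J_m as a bag-of-words
    interviews: List[InterviewRecord] = field(default_factory=list)

def to_bow(text: str) -> Counter:
    """Tokenize on whitespace; a real pipeline would segment Chinese text."""
    return Counter(text.lower().split())

# One JobRecord per unique job, holding |D_m| interview records.
S = [
    JobRecord(
        job_description_words=to_bow("web development PHP experience"),
        interviews=[
            InterviewRecord(
                resume_words=to_bow("HTTP CSS web site framework"),
                assessment_words=to_bow("solid CSS foundation knowledge"),
                interview_type="TI",
            )
        ],
    )
]
```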
3 Technical Details of JLMIA

To solve the above tasks, we propose a novel joint learning model, namely JLMIA. In this section, we formally introduce its technical details.

3.1 Model Formulation

To model the latent semantics in job descriptions, resumes, and interview assessments, we assume there exist latent topics, represented by $\varphi^J$, $\varphi^R$ and $\varphi^E$, in all of them. Our tasks are then transformed into modeling the relationships among these latent topics. First, to model the strong correlation between resume $R_{md}$ and interview assessment $E_{md}$, we directly assume they share the same tuple-specific distribution $\theta^A_{md}$ over topics. Second, to reveal the relationships between job descriptions and resumes, along with the difference between their diversities, we generate $\theta^A_{md}$ from a logistic-normal distribution whose mean parameter is related to the topic distribution $\theta^J_m$ of the job description. The topic numbers of $\varphi^J$, $\varphi^R$ and $\varphi^E$ are set as $|k^E| = |k^R| = C \cdot |k^J| = CK$; in other words, for each topic in $\varphi^J$, there are $C$ topics in $\varphi^R$ ($\varphi^E$) related to it. Third, we use a label $I \in \{TI, CI\}$ (i.e., Technical Interview or Comprehensive Interview) to indicate the type of interview for each interview assessment, where different types of interview assessment are generated from different topics $\varphi^E \in \{\varphi^{ET}, \varphi^{EC}\}$. To simplify our model, we follow the idea in [Wang and McCallum, 2006] and set the interview label for each word in the interview assessment instead of for the entire assessment.

[Figure 1: The graphical representation of JLMIA.]

The graphical model of JLMIA is shown in Figure 1. Since the generative process of the job description is the same as in Latent Dirichlet Allocation (LDA) [Blei et al., 2003], we only list the generative process for the resumes and interview assessments $A = \{A_m\}_{m=1}^{|M|}$, shown in Algorithm 1.

Algorithm 1: The Generative Process of JLMIA for Resume and Interview Assessment
1. For each topic $k$ of candidate interview records:
   (a) Draw $\varphi^R_k$ from the Dirichlet prior $Dir(\beta^R)$.
   (b) Draw $\varphi^{ET}_k$ and $\varphi^{EC}_k$ from the Dirichlet prior $Dir(\beta^E)$.
2. For each job description $J_m$:
   (a) Sample the topic distribution $\theta^J_m \sim Dir(\alpha)$.
3. For each candidate interview record pair $(R_{md}, E_{md}, I_{md})$:
   (a) Sample the topic distribution $\theta^A_{md} \sim \mathcal{N}(h(\theta^J_m, C), \delta^2 I)$.
   (b) For the $r$-th word $w^R_{mdr}$ in resume $R_{md}$:
       i. Draw the topic assignment $z^R_{mdr} \sim Multi(\pi(\theta^A_{md}))$.
       ii. Draw the word $w^R_{mdr} \sim Multi(\varphi^R_{z^R_{mdr}})$.
   (c) For the $e$-th word $w^E_{mde}$ in interview assessment $E_{md}$:
       i. Draw the topic assignment $z^E_{mde} \sim Multi(\pi(\theta^A_{md}))$.
       ii. Draw the word $w^E_{mde} \sim Multi(\varphi^{ET}_{z^E_{mde}})$ if $I_{mde} = TI$.
       iii. Draw the word $w^E_{mde} \sim Multi(\varphi^{EC}_{z^E_{mde}})$ if $I_{mde} = CI$.

Here $h(\theta, C)$, in line 3.(a), is a vector concatenating $C$ log vectors of $\theta$, i.e., $h(\theta^J_m, C)_k = \log \theta^J_{m,k'}$ with $k' = k \bmod K$, $1 \le k' \le K$, and $\pi(\theta)$, in lines 3.(b).i and 3.(c).i, is the logistic transformation, i.e.,

$$\pi(\theta^A_{md})_k = \frac{\exp\{\theta^A_{md,k}\}}{\sum_{i=1}^{CK} \exp\{\theta^A_{md,i}\}}.$$

Due to the non-conjugacy of the logistic normal and the multinomial, the posterior of the latent parameters is intractable. Thus we propose a variational inference algorithm for JLMIA.
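Before turning to inference, the following is a compact numpy sketch of Algorithm 1 under toy dimensions; it is our illustration of the generative story, not the authors' code, and all sizes, seeds and hyperparameter values are arbitrary assumptions.

```python
# A toy run of Algorithm 1: draw topics, one job, and one interview record.
import numpy as np

rng = np.random.default_rng(0)
K, C = 3, 2                     # job-description topics; expansion factor
CK = C * K                      # resume / assessment topics
V_J, V_R, V_E = 50, 60, 40      # toy vocabulary sizes
alpha, beta_R, beta_E, delta2 = 1.0, 0.1, 0.1, 0.01

# Step 1: topic-word distributions.
phi_R = rng.dirichlet(np.full(V_R, beta_R), size=CK)
phi_ET = rng.dirichlet(np.full(V_E, beta_E), size=CK)
phi_EC = rng.dirichlet(np.full(V_E, beta_E), size=CK)

# Step 2: a job description's topic distribution theta^J_m.
theta_J = rng.dirichlet(np.full(K, alpha))

def h(theta, C):
    """Concatenate C copies of log(theta): h(theta, C)_k = log theta_{k mod K}."""
    return np.tile(np.log(theta), C)

def pi(theta_A):
    """Logistic (softmax) transformation onto the CK-simplex."""
    e = np.exp(theta_A - theta_A.max())
    return e / e.sum()

# Step 3: one candidate interview record (R_md, E_md, I_md).
theta_A = rng.normal(h(theta_J, C), np.sqrt(delta2))   # ~ N(h(theta^J, C), delta^2 I)
probs = pi(theta_A)

resume = [rng.choice(V_R, p=phi_R[z]) for z in rng.choice(CK, size=30, p=probs)]
labels = rng.choice(["TI", "CI"], size=20)             # per-word interview labels
assessment = [
    rng.choice(V_E, p=(phi_ET if lab == "TI" else phi_EC)[z])
    for z, lab in zip(rng.choice(CK, size=20, p=probs), labels)
]
```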
3.2 Variational Inference for JLMIA

Here, we develop a variational inference algorithm for JLMIA based on mean-field variational families. The basic idea behind variational inference is to optimize the free parameters of a distribution over the latent variables, so that the distribution is close in Kullback-Leibler (KL) divergence to the true posterior, for which it can then be substituted. In our model, let us denote all latent variable parameters by $\Phi$ and all hyperparameters by $\Omega$. Following the generative process, the joint distribution can be factored as

$$p(S, \Phi|\Omega) = p(\Phi|\Omega) \prod_{m=1}^{|M|} p(S_m|\Phi), \qquad (1)$$

where each component can be calculated by

$$p(S_m|\Phi) = p(J_m|z^J_m, \varphi^J) \prod_{d=1}^{|D_m|} p(R_{md}|z^R_{md}, \varphi^R)\, p(E_{md}|z^E_{md}, \varphi^E, I_{md}),$$

$$p(\Phi|\Omega) = \prod_{m=1}^{|M|} \prod_{d=1}^{|D_m|} p(\theta^A_{md}|\theta^J_m, \delta^2) \prod_{r=1}^{N^R_{md}} p(z^R_{mdr}|\theta^A_{md}) \prod_{e=1}^{N^E_{md}} p(z^E_{mde}|\theta^A_{md}) \times \prod_{k=1}^{K} p(\varphi^J_k|\beta^J) \prod_{k=1}^{CK} p(\varphi^R_k|\beta^R)\, p(\varphi^{ET}_k, \varphi^{EC}_k|\beta^E) \times \prod_{m=1}^{|M|} p(\theta^J_m|\alpha) \prod_{j=1}^{N^J_m} p(z^J_{mj}|\theta^J_m).$$

Then, corresponding to this joint distribution, we posit the fully factorized variational family as follows, where a detailed description of each term can be found in the Appendix:

$$q(\Phi) = \prod_{k=1}^{K} q(\varphi^J_k) \prod_{k=1}^{CK} q(\varphi^R_k)\, q(\varphi^{ET}_k)\, q(\varphi^{EC}_k) \prod_{m=1}^{|M|} q(\theta^J_m) \prod_{j=1}^{N^J_m} q(z^J_{mj}) \times \prod_{m=1}^{|M|} \prod_{d=1}^{|D_m|} \prod_{k=1}^{CK} q(\theta^A_{md,k}) \prod_{r=1}^{N^R_{md}} q(z^R_{mdr}) \prod_{e=1}^{N^E_{md}} q(z^E_{mde}). \qquad (2)$$

According to [Blei et al., 2017], minimizing the KL divergence between the variational distribution and the true posterior is equivalent to maximizing a lower bound on the log likelihood of the job interview records, i.e., the evidence lower bound (ELBO):

$$\log p(S|\Omega) \ge E_q[\log p(S, \Phi|\Omega)] + H(q) = E_q[\log p(\Phi|\Omega)] + \sum_{m=1}^{|M|} E_q[\log p(S_m|\Phi)] + H(q), \qquad (3)$$

where the expectation $E_q[\cdot]$ is taken with respect to the variational distribution in Equation 2, and $H(q)$ denotes the entropy of that distribution.

The largest challenge in maximizing the ELBO is the non-conjugacy of the logistic normal and the multinomial, which makes it difficult to compute the expected log probability of the topic assignments in the documents of each candidate interview record. Similar to [Wang and Blei, 2011], we introduce a new variational parameter $\zeta = \{\zeta_{m,1:|D_m|}\}_{m=1:|M|}$ to preserve a lower bound of the ELBO. Here we take $E_q[\log p(z^R_{mdr}|\theta^A_{md})]$ as an example ($E_q[\log p(z^E_{mde}|\theta^A_{md})]$ can be computed in a similar way):

$$E_q[\log p(z^R_{mdr}|\theta^A_{md})] = E_q[\theta^A_{md,z^R_{mdr}}] - E_q\Big[\log\Big(\sum_{k=1}^{CK} \exp\{\theta^A_{md,k}\}\Big)\Big] \ge E_q[\theta^A_{md,z^R_{mdr}}] - \zeta^{-1}_{md}\Big(\sum_{k=1}^{CK} E_q[\exp\{\theta^A_{md,k}\}]\Big) + 1 - \log(\zeta_{md}). \qquad (4)$$

To maximize the ELBO, we develop an EM-style algorithm with a coordinate ascent approach to optimize the parameters, the details of which can be found in the Appendix.
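Equation (4) is the tangent bound $\log x \le x/\zeta - 1 + \log \zeta$ applied inside the expectation; under the mean-field Gaussian $q(\theta^A_{md,k}) = \mathcal{N}(\gamma^A_{md,k}, \delta^2)$ used in the Appendix, $E_q[\exp\{\theta^A_{md,k}\}]$ has the closed form $\exp\{\gamma^A_{md,k} + \delta^2/2\}$. The snippet below is a small numerical sanity check of this bound; it is our illustration with assumed toy values, not the authors' code.

```python
# Check that the zeta-bound of Equation (4) upper-bounds E_q[log sum_k exp(theta_k)].
import numpy as np

rng = np.random.default_rng(1)
CK, delta2 = 6, 0.01
gamma = rng.normal(size=CK)               # variational means gamma^A_{md,1:CK}

def log_sum_exp_bound(gamma, delta2, zeta):
    """zeta^{-1} * sum_k E_q[exp(theta_k)] - 1 + log(zeta)."""
    expected_exp = np.exp(gamma + delta2 / 2.0)
    return expected_exp.sum() / zeta - 1.0 + np.log(zeta)

# The bound is tightest at zeta = sum_k exp(gamma_k + delta^2/2) (Appendix update).
zeta_opt = np.exp(gamma + delta2 / 2.0).sum()

# Monte Carlo estimate of the intractable term E_q[log sum_k exp(theta_k)].
samples = rng.normal(gamma, np.sqrt(delta2), size=(100_000, CK))
mc_value = np.log(np.exp(samples).sum(axis=1)).mean()

assert mc_value <= log_sum_exp_bound(gamma, delta2, zeta_opt) + 1e-3
```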
4 Application

Here, we introduce two applications enabled by JLMIA, i.e., Person-Job Fit and Interview Question Recommendation.

4.1 Person-Job Fit

Person-Job Fit is the process of matching the right talent to the right job. Formally, given a job description $J_g$ and a resume $R_g$, the objective is to measure their matching degree. Specifically, we first leverage JLMIA to infer the latent topic distributions of $J_g$ and $R_g$ respectively. However, our model cannot infer the topic distribution for an individual resume or job description. Thus we construct an $S$ for the resume (or job description) in which all other data are set empty, and infer the corresponding topic distribution.

After the variational parameters of the topic assignment of each word, $\phi^J_g$ and $\phi^R_g$, are learned, we can compute the document-topic distributions by

$$\theta^J_{g,k} = \frac{1}{N^J} \sum_{n=1}^{N^J} \phi^J_{gn,k}, \quad k = 1, \cdots, K, \qquad \theta^R_{g,k} = \frac{1}{N^R} \sum_{n=1}^{N^R} \phi^R_{gn,k}, \quad k = 1, \cdots, CK. \qquad (5)$$

Then, by computing the similarity between $\theta^J_g$ and $\theta^R_g$, we can measure the suitability between the job description and the resume. Actually, any distance formulation between two probability distributions can be used here to measure the similarity, such as Cosine distance and Kullback-Leibler divergence. Note that, since the dimensions of $\theta^J_g$ and $\theta^R_g$ may be different, here we have

$$\tilde{\theta}^R_{g,k} = \sum_{c \in C_k} \theta^R_{g,c}, \quad k = 1, \cdots, K,$$

where $C_k$ is the set of mapping indices such that each $\theta^R_{g,c}$ ($c \in C_k$) is generated from $\theta^J_{g,k}$. In particular, the vectors $\theta^J_g$ and $\theta^R_g$ learned by JLMIA can be regarded as low-rank semantic representations of the job description and the resume. Thus, using these representations instead of the original bag-of-words as features for training a classifier (e.g., Random Forest) is another solution for Person-Job Fit.
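The following is a minimal sketch (ours) of the similarity-based matching score: collapse the $CK$-dimensional resume vector onto the $K$ job-description topics and compare with cosine similarity. The layout used here (resume topics $k$ and $k+K$ mapping to job topic $k$) is one assumed indexing consistent with $k' = k \bmod K$; the toy numbers are illustrative.

```python
# Person-Job Fit score from the collapsed resume topic vector (Equation 5 outputs).
import numpy as np

def collapse_resume_topics(theta_R: np.ndarray, K: int, C: int) -> np.ndarray:
    """theta~^R_k = sum of the C resume-topic weights mapped to job topic k."""
    return theta_R.reshape(C, K).sum(axis=0)

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def person_job_fit(theta_J: np.ndarray, theta_R: np.ndarray, C: int) -> float:
    K = theta_J.shape[0]
    return cosine(theta_J, collapse_resume_topics(theta_R, K, C))

# Toy usage: K = 3 job topics, C = 2 resume topics per job topic (CK = 6).
theta_J = np.array([0.6, 0.3, 0.1])
theta_R = np.array([0.30, 0.20, 0.05, 0.25, 0.15, 0.05])
print(person_job_fit(theta_J, theta_R, C=2))
```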
4.2 Interview Question Recommendation

During the interview, interviewers need to ask questions to evaluate candidates. However, due to limited expert resources, interviewers sometimes may not have enough domain knowledge to prepare discriminative questions for systematically judging the competencies of candidates, especially from the view of Person-Job Fit. Thus, in this paper, we propose an effective algorithm for recommending interview questions based on JLMIA and the interview questions accumulated in historical assessments.

To be specific, given a question database $Q = \{q_i\}_{i=1}^{N}$, the problem of interview question recommendation is defined as retrieving a set of questions $X$ that are related to a given query $\Upsilon$ (i.e., a job requirement item or an experience item of a candidate). Similar to the process of computing the topic distribution $\theta^R_g$, we can compute the topic distribution $\theta^Q_i$ of each question $q_i \in Q$ by regarding interview questions as a part of the interview assessment. Let $\theta^Q_i$ denote the latent representation of question $q_i$, and let the latent representation of the given query $\Upsilon$ be $\theta^\Upsilon_g \in \{\theta^J_g, \theta^R_g\}$.

To recommend high-quality questions to interviewers, on the one hand, the selected question set $X \subset Q$, $|X| = L$, should be relevant to the query $\theta^\Upsilon_g$; on the other hand, we hope to avoid making the questions in $X$ too similar to each other. To balance the relevance and diversity of the selected question set, we select $X$ by maximizing the following objective function:

$$F(\Upsilon, X) = \mu \frac{Rel(\Upsilon, X)}{\overline{Rel}} + (1-\mu) \frac{Div(X)}{\overline{Div}} = \mu \frac{\sum_{q_j \in X} Sim(\theta^\Upsilon_g, \theta^Q_j)}{\overline{Rel}} + (1-\mu) \frac{\sum_{q_i \in X} \sum_{q_j \in X, q_j \ne q_i} Dis(\theta^Q_i, \theta^Q_j)}{\overline{Div}},$$
$$\text{s.t. } X \subset Q, \; |X| = L, \; 0 < \mu < 1, \qquad (6)$$

where $Rel(\Upsilon, X)$ and $Div(X)$ measure the relevance and diversity above, $Sim(*, *)$ is chosen as $Cosine(*, *)$ while $Dis(*, *)$ is set as $1 - Cosine(*, *)$, and $\overline{Rel}$ and $\overline{Div}$ are normalization factors, commonly chosen as the maximum possible values of $Rel(\Upsilon, X)$ and $Div(X)$ respectively.

In general, maximizing $F(\Upsilon, X)$ directly is computationally prohibitive, since calculating Equation 6 for all subsets $X$ suffers from a combinatorial explosion. Fortunately, since the $F(\Upsilon, X)$ defined in this paper is submodular [Tang et al., 2014], a simple greedy algorithm can achieve a $(1 - 1/e)$ approximation of the optimal solution.
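The routine below is a sketch (ours) of how such a $(1 - 1/e)$ greedy selection for Equation (6) could look: at each step it adds the question with the largest objective gain. The default values of $\overline{Rel}$, $\overline{Div}$ and $\mu$ mirror the experimental settings reported in Section 5.4, but the function names and data are assumptions for illustration.

```python
# Greedy maximization of the submodular relevance/diversity objective F.
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def greedy_select(theta_query, theta_questions, L, mu=0.9, rel_bar=5.0, div_bar=20.0):
    """Pick L question indices, each step maximizing the objective F."""
    candidates = list(range(len(theta_questions)))
    selected = []

    def F(X):
        rel = sum(cosine(theta_query, theta_questions[j]) for j in X)
        div = sum(1.0 - cosine(theta_questions[i], theta_questions[j])
                  for i in X for j in X if i != j)
        return mu * rel / rel_bar + (1.0 - mu) * div / div_bar

    for _ in range(L):
        best = max(candidates, key=lambda c: F(selected + [c]))
        selected.append(best)
        candidates.remove(best)
    return selected

# Toy usage: 8 candidate questions with 3-dimensional topic vectors.
rng = np.random.default_rng(2)
theta_Q = rng.dirichlet(np.ones(3), size=8)
print(greedy_select(rng.dirichlet(np.ones(3)), theta_Q, L=4))
```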
5 Experimental Results

In this section, we evaluate the performance of JLMIA on a real-world interview data set.

5.1 Experimental Setup

The data set used in the experiments is historical recruitment data provided by a high-tech company in China, containing 14,702 candidate interview records in total. To be specific, with the help of several staffing experts, we manually screened for records with high-quality interview assessments written by senior interviewers, and removed records lacking details in the job description or resume. After that, the filtered data set contains 4,816 candidate interview records related to 409 job positions. In JLMIA, we empirically set the fixed parameters $\{\delta^2, \beta^J, \beta^R, \beta^E\} = \{0.01, 0.1, 0.1, 0.1\}$. Note that our model is trained on the original Chinese words; to facilitate demonstration, all experimental results were translated into English.

5.2 Evaluation of Topic Joint Learning

To evaluate the effectiveness of the topics jointly learned by JLMIA, we first trained our model on all successful job interview records. In particular, we set the parameters $K = 10$ and $C = 2$. Table 1 shows one randomly selected latent topic of the job description and the corresponding topics of the resume and the two types of interview assessment. Each topic is represented by several words with the highest probability.

Table 1: Topic example of JLMIA.
- Job Description, Topic 1: Experience, Foundation, Technology, PHP, Web, Proficient, Engineer, Interest, Development, Web Page, Maintenance.
- Resume, Topic 1: Function, Management, Backstage, HTTP, Module. Topic 11: Web Site, Web Page, System, Web, Framework.
- Tech. Interview, Topic 1: Foundation, Knowledge, Code, Element, Development. Topic 11: JS, Methods, CSS, Elements, Events.
- Com. Interview, Topic 1: Technology, Communication, Study, Knowledge, Development. Topic 11: Job, Pressure, Solution, Like, Work Overtime.

We can observe that the topic of the job description, containing "Experience" and "Foundation" of "PHP" and "Web", should be related to web development. Similarly, the corresponding topics of the resume and the technical interview assessment also contain front-end-related keywords, "HTTP", "Web Site", "Element", "JS" and "CSS", which indicate the professional skills of candidates. Thus we believe that our model can effectively reveal the latent relationships among job description, resume and interview assessment. More interestingly, we can find that topic #1 and topic #11 of the resume, which are both generated from topic #1 of the job description, contain different keywords, which validates the assumption that the diversity of job descriptions is less than that of resumes. Meanwhile, compared with the technical interview assessment, there are more keywords like "Communication", "Pressure" or "Work Overtime" in the comprehensive interview assessment, which are related to the evaluation of personal qualities.

5.3 Performance of Person-Job Fit

Here, we evaluate the performance of JLMIA in terms of Person-Job Fit. Specifically, given a job description and a resume, we treat their latent topic distributions learned by our model as their representation vectors. Then, we train classic classifiers to predict the matching degree between the job and the candidate. Besides, to further demonstrate the effectiveness of our model, we also use the similarities between their representation vectors to measure Person-Job Fit.

Benchmark Methods
We selected Latent Dirichlet Allocation (LDA) and the bag-of-words (BOW) vector representation as baselines. For LDA, we merged the resume of a candidate and the job she applied for into one document for learning the latent topics. For bag-of-words, where the $i$-th dimension of each vector is the frequency of the $i$-th word of the vocabulary, the vector itself is a kind of representation. Due to the limited space and the similar trends of the results, here we only selected Cosine and Kullback-Leibler similarity based approaches, and selected Random Forests and GBDT as classifiers. Please note that, because the similarity between two BOW vectors is meaningless here, we did not treat it as a baseline.

Data Preparation
Different from the similarity-based approaches, for which only one type of samples, i.e., positive samples, is required, the classifier-based approaches need unsuitable pairs of job description and resume as negative samples to train the classifiers. Although we could intuitively regard the historical failed job applications as negative samples, we do not know the exact reasons behind these failures. For example, some failed applications are simply due to low pay and benefits, or similar reasons in offer negotiation. Therefore, we manually generated the same number of negative samples for training the classifiers, by randomly pairing resumes and job descriptions from the successful job interview records. Along this line, the experiments focus only on the representation of latent topics, while interference from other factors is impaired. After that, we randomly selected 80% of the data for model training and the other 20% for testing.
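The following is a minimal sketch (ours) of this evaluation protocol: positives are matched (job, resume) topic-vector pairs, negatives pair each job with a randomly drawn resume, and a Random Forest is trained on the concatenated vectors. The n_estimators=400 setting follows the results table below; the random topic vectors are toy stand-ins, so the resulting AUC is not meaningful.

```python
# Classifier-based Person-Job Fit evaluation with randomly paired negatives.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(3)
n, K, CK = 500, 10, 20
theta_J = rng.dirichlet(np.ones(K), size=n)    # job topic vectors (toy stand-ins)
theta_R = rng.dirichlet(np.ones(CK), size=n)   # matching resume topic vectors

pos = np.hstack([theta_J, theta_R])                       # matched pairs -> label 1
neg = np.hstack([theta_J, theta_R[rng.permutation(n)]])   # shuffled pairs -> label 0
X = np.vstack([pos, neg])
y = np.concatenate([np.ones(n), np.zeros(n)])

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
clf = RandomForestClassifier(n_estimators=400, random_state=0).fit(X_tr, y_tr)
print("ROC AUC:", roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1]))
```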
Performance Analysis
To evaluate parameter sensitivity, we trained JLMIA varying the parameter $K$ from 10 to 50 and the parameter $C$ from 1 to 5. The Person-Job Fit performance of JLMIA based on Cosine similarity under different parameters is shown in Figures 2(a) and 2(b). We can find that the Receiver Operating Characteristic (ROC) AUCs and Precision-Recall (PR) AUCs are both better with small $K$, and reach their highest values with $K = 10$ and $C = 2$. Therefore, we chose these best parameters $K$ and $C$ for the following experiments. Similarly, we also evaluated the LDA model with different topic number parameters $K$, and chose $K = 30$ for the other experiments.

[Figure 2: The Person-Job Fit performance of JLMIA based on Cosine similarity and different parameters; (a) ROC AUCs and (b) PR AUCs, with K varied from 10 to 50 and C from 1 to 5.]

Table 2 shows the Person-Job Fit performance of JLMIA and the baselines.

Table 2: The Person-Job Fit performance of different approaches.

                                     | Model | ROC AUC | PR AUC
Cosine Similarity                    | JLMIA | 0.8279  | 0.7935
                                     | LDA   | 0.7026  | 0.7223
Kullback-Leibler Divergence          | JLMIA | 0.8234  | 0.8094
                                     | LDA   | 0.6589  | 0.6579
Random Forest (n_estimators=400)     | JLMIA | 0.9012  | 0.8975
                                     | LDA   | 0.7359  | 0.7341
                                     | BOW   | 0.6716  | 0.6761
GBDT (n_estimators=100, max_depth=9) | JLMIA | 0.8564  | 0.8311
                                     | LDA   | 0.7092  | 0.6810
                                     | BOW   | 0.6531  | 0.6723
From the results, we find that our model consistently outperforms the baselines in both the similarity-based and the classifier-based approaches. This indicates that JLMIA can effectively capture the latent relationship between job description and resume. More interestingly, the performance of JLMIA in the similarity-based approach is also higher than most of the baselines. This clearly demonstrates the effectiveness of the representations learned by JLMIA.

5.4 Performance of Question Recommendation

To evaluate the interview question recommendation performance of JLMIA, we first collected 1,085 interview questions from historical interview assessments as the candidate set, and then compared JLMIA ($K = 10$ and $C = 2$) with BM25, a classic information retrieval model based on keyword matching, which ignores the latent relationships between queries and questions. In our algorithm, the parameters are empirically set as $\overline{Rel} = 5$, $\overline{Div} = 20$ and $\mu = 0.9$.

We randomly selected 100 experience items as queries. For each query, we recommended 10 questions by JLMIA and by BM25. Then, we asked 3 senior interviewers to evaluate the recommended questions. They were first required to judge which questions are relevant to the query, where the number of relevant questions is the relevance measure. Then, they needed to judge how many different technical aspects are mentioned in those relevant questions, which is the diversity measure, and how many questions are about personal quality, which should differ between the technical interview (TI) and the comprehensive interview (CI). As the average results in Table 3 show, compared with the traditional keyword-matching approach BM25, JLMIA can recommend questions with more relevance and diversity. Meanwhile, JLMIA can also recommend more questions related to personal qualities; in particular, the number of personal quality questions for the comprehensive interview is larger than for the technical interview, which distinguishes their different focuses.

Table 3: The question recommendation performance of JLMIA and BM25 with 10 questions recommended.

         | Relevance | Diversity | Personal Quality
JLMIA-TI | 8.06      | 2.90      | 2.17
JLMIA-CI | 7.72      | 2.84      | 3.22
BM25     | 7.14      | 1.67      | 1.00

Furthermore, to illustrate the effectiveness of our question recommendation approach, we also show an example of the top 4 questions recommended by the different approaches in Table 4.

Table 4: The case study of question recommendation.
Given experience item: "I am familiar with HTML and CSS programming, and have some web development experience."
Questions recommended by JLMIA for the technical interview:
- T1. What are Ajax and the Interactive Model? What are the differences between synchronous and asynchronous requests? How to solve cross-domain issues?
- T2. What are the meanings of Graceful Degradation and Progressive Enhancement?
- T3. How to make text centered vertically with CSS programming?
- T4. What is the role of the HTTP status code?
Questions recommended by JLMIA for the comprehensive interview:
- C1. Talk about OSI, TCP/IP and the five-layer network model.
- C2. What are the differences between HTML and XHTML?
- C3. Do you think finding a job is not easy for you?
- C4. What are the differences between Scrollbar and JScrollPane?
Questions recommended by BM25:
- B1. What are web applications?
- B2. Talk about your understanding of the semantics of HTML.
- B3. Please program a read-write lock with a normal mutex.
- B4. Talk about your understanding of web standards and the W3C.

Obviously, the given experience item is about web development. First, we find that the questions recommended by JLMIA cover all the technical aspects mentioned in the experience item (e.g., T1, T2 and T3 are about "CSS and HTML programming", and C4 is about "web development"). Also, JLMIA recommends questions designed for "HTTP" (i.e., T4 and C1), which is useful knowledge for web developers. Second, for the comprehensive interview, JLMIA also recommends questions to evaluate the personal qualities of candidates, such as C2, which is related to communication ability and problem analysis ability. Last, since the questions recommended by BM25 must share the same words with the given requirement, the semantic relationships between keywords are neglected (e.g., "HTTP" and "web"). Thus, the questions recommended by BM25 do not contain as many technical details.
6 Related Work

Recruitment Analysis. With the importance of talent at an all-time high and the availability of recruitment big data, recruitment analysis has been attracting more and more attention [Xu et al., 2016; Zhu et al., 2016]. As early as 2006, Malinowski et al. tried to find a good match between talents and jobs with two distinct recommendation systems [Malinowski et al., 2006]. In 2011, Paparrizos et al. exploited all historical job transitions as well as the data associated with employees and institutions to predict the next job transition of employees [Paparrizos et al., 2011]. Recently, besides the matching of talents and jobs [Rácz et al., 2016], researchers have also been devoted to analyzing the recruitment market from more novel perspectives, such as market trend analysis [Zhu et al., 2016; Lin et al., 2017], career development analysis [Li et al., 2017], talent circles [Xu et al., 2016] and the popularity of job skills [Xu et al., 2018]. Although the above studies have explored different research aspects of the recruitment market, few of them are developed for enhancing the quality and experience of job interviews. To this end, in this paper, we proposed a novel approach for intelligent job interview assessment by jointly learning multiple perspectives from large-scale real-world interview data.

Text Mining with Topic Models. Probabilistic topic models are capable of grouping semantically coherent words into human-interpretable topics. As an important member of the archetypal topic models, Latent Dirichlet Allocation (LDA) [Blei et al., 2003] has many extensions [Zhu et al., 2014; Mimno et al., 2009; Pyo et al., 2015]. Among them, some works focus on modeling shared latent topic distributions among multiple categories of documents, and have a wide range of practical applications. For example, Mimno et al. [Mimno et al., 2009] designed a polylingual topic model that discovers topics aligned across multiple languages. Pyo et al. [Pyo et al., 2015] proposed a novel model to learn the shared topic distribution between users and TV programs for TV program recommendation. Different from existing research efforts, in this paper we developed a novel model, JLMIA, to jointly model job description, candidate resume and interview assessment.

7 Concluding Remarks

In this paper, we proposed a novel approach for intelligent job interview assessment by learning from large-scale real-world interview data. To be specific, we first developed a latent variable model, JLMIA, to jointly model job description, candidate resume and interview assessment. JLMIA can effectively learn the representative perspectives of different job interview processes from the successful job interview records in history. Furthermore, we exploited JLMIA for two real-world applications, namely person-job fit and interview question recommendation. Extensive experiments conducted on real-world data clearly validate the effectiveness of JLMIA, which can lead to substantially less bias in job interviews and provide a valuable understanding of job interview assessment.

Acknowledgements

This work was partially supported by grants from the National Natural Science Foundation of China (Grant No. 91746301, 61727809, 61703386).
A EM Algorithm of Variational Inference

In this appendix, we give some details of the EM-style algorithm of variational inference outlined in Section 3.2.

First of all, we define each variational distribution term of the variational family in Equation 2. To be specific, the variational distribution of each topic proportion vector $\theta^J_m$ is Dirichlet, parameterized by the vector $\gamma^J_m$. The variational distribution of $\theta^A_{md,k}$, the $k$-th dimension of the topic proportion vector $\theta^A_{md}$, is a univariate Gaussian $\{\gamma^A_{md,k}, \delta^2\}$. The variational distributions of $z^J_{mj}$, $z^R_{mdr}$ and $z^E_{mde}$ are specified by free multinomials with parameters $\phi^J_{mj,1:K}$, $\phi^R_{mdr,1:CK}$ and $\phi^E_{mde,1:CK}$ respectively. The variational distributions of $\varphi^J_k$, $\varphi^R_k$, $\varphi^{ET}_k$ and $\varphi^{EC}_k$ are Dirichlet, parameterized by $\lambda^J_{k,1:|V^J|}$, $\lambda^R_{k,1:|V^R|}$, $\lambda^{ET}_{k,1:|V^E|}$ and $\lambda^{EC}_{k,1:|V^E|}$, where $|V^J|$, $|V^R|$ and $|V^E|$ are the vocabulary sizes of job descriptions, resumes and interview assessments, respectively.

Actually, each term of the ELBO in JLMIA is similar to some part of the ELBO in the LDA model [Blei et al., 2003] or the CTM model [Wang and Blei, 2011], except $E_q[\log p(\theta^A_{md}|\theta^J, \delta^2)]$, which can be computed by

$$E_q[\log p(\theta^A_{md}|\theta^J, \delta^2)] = E_q[\log \mathcal{N}(\theta^A_{md}|h(\theta^J_m, C), \delta^2 I)] = -\frac{CK}{2}(\log \delta^2 + \log 2\pi) - \frac{1}{2\delta^2} \sum_{k=1}^{CK} E_q[(\theta^A_{md,k} - \log \theta^J_{m,k'})^2],$$

$$E_q[(\theta^A_{md,k} - \log \theta^J_{m,k'})^2] = \delta^2 + \Psi'(\gamma^J_{m,k'}) - \Psi'(|\gamma^J_{m,1:K}|) + \big(\gamma^A_{md,k} - \Psi(\gamma^J_{m,k'}) + \Psi(|\gamma^J_{m,1:K}|)\big)^2,$$

where we denote $|\gamma^J_{m,1:K}| = \sum_{i=1}^{K} \gamma^J_{m,i}$ and $k' = k \bmod K$; similar symbols are not described later for simplicity. $\Psi(\cdot)$ is the Digamma function, with derivative $\Psi'(\cdot)$.

Then, we describe our EM-style algorithm. In the E-step, we employ a coordinate ascent approach to optimize all variational parameters. First, we optimize $\zeta_{md}$ in Equation 4:

$$\hat{\zeta}_{md} = \sum_{k=1}^{CK} \exp\{\gamma^A_{md,k} + \delta^2/2\}.$$

Second, we optimize $\phi^J_{mj,1:K}$, $\phi^R_{mdr,1:CK}$ and $\phi^E_{mde,1:CK}$ for each coordinate. Assume that $w^J_{mj} = c$, $w^R_{mdr} = t$ and $w^E_{mde} = i$ with $I_{mde} = TI$:

$$\hat{\phi}^J_{mj,k} \propto \exp\{\Psi(\lambda^J_{k,c}) - \Psi(|\lambda^J_{k,1:|V^J|}|) + \Psi(\gamma^J_{m,k}) - \Psi(|\gamma^J_{m,1:K}|)\},$$
$$\hat{\phi}^R_{mdr,k} \propto \exp\{\Psi(\lambda^R_{k,t}) - \Psi(|\lambda^R_{k,1:|V^R|}|) + \gamma^A_{md,k}\},$$
$$\hat{\phi}^E_{mde,k} \propto \exp\{\Psi(\lambda^{ET}_{k,i}) - \Psi(|\lambda^{ET}_{k,1:|V^E|}|) + \gamma^A_{md,k}\}.$$

Third, we optimize $\gamma^J_m$. Since there is no analytic solution, we use Newton's method for each coordinate:

$$\frac{dELBO}{d\gamma^J_{m,i}} = -\frac{1}{2\delta^2} \sum_{d=1}^{D_m} \sum_{k=1}^{CK} \Big[ 2\big(\Psi(\gamma^J_{m,k'}) - \Psi(|\gamma^J_{m,1:K}|) - \gamma^A_{md,k}\big) \big(\delta^i_{k'} \Psi'(\gamma^J_{m,k'}) - \Psi'(|\gamma^J_{m,1:K}|)\big) + \delta^i_{k'} \Psi''(\gamma^J_{m,k'}) - \Psi''(|\gamma^J_{m,1:K}|) \Big] + \sum_{k=1}^{K} \big(|\phi^J_{m,1:N^J_m,k}| + \alpha_k - \gamma^J_{m,k}\big) \big(\delta^i_k \Psi'(\gamma^J_{m,k}) - \Psi'(|\gamma^J_{m,1:K}|)\big),$$

where $\delta^y_x = 1$ only if $x = y$, otherwise $\delta^y_x = 0$.

Fourth, we optimize $\gamma^A_{md,1:CK}$. Again, since there is no analytic solution, we use the conjugate gradient algorithm with derivative

$$\frac{dELBO}{d\gamma^A_{md,k}} = -\frac{1}{\delta^2}\big(\gamma^A_{md,k} - \Psi(\gamma^J_{m,k'}) + \Psi(|\gamma^J_{m,1:K}|)\big) + |\phi^R_{md,1:N^R_{md},k}| + |\phi^E_{md,1:N^E_{md},k}| - (N^R_{md} + N^E_{md})\,\zeta^{-1}_{md} \exp\{\gamma^A_{md,k} + \delta^2/2\}.$$

Last, we optimize $\lambda^J$, $\lambda^R$, $\lambda^{ET}$ and $\lambda^{EC}$. Their calculations are similar; taking $\lambda^J_{k,c}$ and $\lambda^{ET}_{k,i}$ as examples:

$$\lambda^J_{k,c} = \beta^J_c + \sum_{m=1}^{M} \sum_{j=1}^{N^J_m} \phi^J_{mj,k}\, \delta^c_{w^J_{mj}},$$
$$\lambda^{ET}_{k,i} = \beta^E_i + \sum_{m=1}^{M} \sum_{d=1}^{D_m} \sum_{e=1}^{N^E_{md}} \phi^E_{mde,k}\, \delta^i_{w^E_{mde}}\, \delta^{TI}_{I_{mde}}.$$

In the M-step, we maximize the ELBO with respect to the parameter $\alpha$, similarly to LDA, and regard the other hyperparameters in $\Omega$ as fixed.
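To make the closed-form E-step updates concrete, the following is a minimal sketch (ours, not the authors' code) of the $\zeta$ update and the $\phi^R$ coordinate update, written with scipy's digamma; shapes and inputs are toy assumptions.

```python
# Two closed-form E-step updates: zeta_md and phi^R_{mdr,1:CK}.
import numpy as np
from scipy.special import digamma

def update_zeta(gamma_A, delta2):
    """zeta_md = sum_k exp(gamma^A_{md,k} + delta^2 / 2)."""
    return np.exp(gamma_A + delta2 / 2.0).sum()

def update_phi_R(word_id, lambda_R, gamma_A):
    """phi^R_{mdr,k} ∝ exp{Psi(lambda^R_{k,t}) - Psi(|lambda^R_{k,:}|) + gamma^A_{md,k}}."""
    log_phi = digamma(lambda_R[:, word_id]) - digamma(lambda_R.sum(axis=1)) + gamma_A
    log_phi -= log_phi.max()              # stabilize before normalizing
    phi = np.exp(log_phi)
    return phi / phi.sum()

# Toy usage: CK = 4 topics, resume vocabulary of 7 words.
rng = np.random.default_rng(4)
lambda_R = rng.gamma(1.0, 1.0, size=(4, 7))   # variational Dirichlet parameters
gamma_A = rng.normal(size=4)
print(update_zeta(gamma_A, 0.01))
print(update_phi_R(word_id=2, lambda_R=lambda_R, gamma_A=gamma_A))
```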
References

[Bersin, 2013] Josh Bersin. https://www.forbes.com/sites/joshbersin/2013/05/23/corporate-recruitment-transformed-new-breed-of-service-providers/. 2013.

[Blei et al., 2003] David M. Blei, Andrew Y. Ng, and Michael I. Jordan. Latent dirichlet allocation. Journal of Machine Learning Research, 3:993–1022, March 2003.

[Blei et al., 2017] David M. Blei, Alp Kucukelbir, and Jon D. McAuliffe. Variational inference: A review for statisticians. Journal of the American Statistical Association, 2017.

[Li et al., 2017] Huayu Li, Yong Ge, Hengshu Zhu, Hui Xiong, and Hongke Zhao. Prospecting the career development of talents: A survival analysis perspective. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 917–925, 2017.

[Lin et al., 2017] Hao Lin, Hengshu Zhu, Yuan Zuo, Chen Zhu, Junjie Wu, and Hui Xiong. Collaborative company profiling: Insights from an employee's perspective. In Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, pages 1417–1423, 2017.

[Malinowski et al., 2006] Jochen Malinowski, Tobias Keim, Oliver Wendt, and Tim Weitzel. Matching people and jobs: A bilateral recommendation approach. In Proceedings of the 39th Annual Hawaii International Conference on System Sciences (HICSS'06), volume 6, pages 137c–137c. IEEE, 2006.

[Mimno et al., 2009] David Mimno, Hanna M. Wallach, Jason Naradowsky, David A. Smith, and Andrew McCallum. Polylingual topic models. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, pages 880–889. Association for Computational Linguistics, 2009.
[Paparrizos et al., 2011] Ioannis Paparrizos, B. Barla Cambazoglu, and Aristides Gionis. Machine learned job recommendation. In Proceedings of the Fifth ACM Conference on Recommender Systems, pages 325–328. ACM, 2011.

[Pyo et al., 2015] Shinjee Pyo, Eunhui Kim, et al. LDA-based unified topic modeling for similar TV user grouping and TV program recommendation. IEEE Transactions on Cybernetics, 45(8):1476–1490, 2015.

[Rácz et al., 2016] Gábor Rácz, Attila Sali, and Klaus-Dieter Schewe. Semantic Matching Strategies for Job Recruitment: A Comparison of New and Known Approaches. Springer International Publishing, 2016.

[Tang et al., 2014] Fangshuang Tang, Qi Liu, Hengshu Zhu, Enhong Chen, and Feida Zhu. Diversified social influence maximization. In 2014 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), pages 455–459. IEEE, 2014.

[Wang and Blei, 2011] Chong Wang and David M. Blei. Collaborative topic modeling for recommending scientific articles. In Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 448–456. ACM, 2011.

[Wang and McCallum, 2006] Xuerui Wang and Andrew McCallum. Topics over time: A non-Markov continuous-time model of topical trends. In Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 424–433. ACM, 2006.

[Xu et al., 2016] Huang Xu, Zhiwen Yu, Jingyuan Yang, Hui Xiong, and Hengshu Zhu. Talent circle detection in job transition networks. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 655–664. ACM, 2016.

[Xu et al., 2018] Tong Xu, Hengshu Zhu, Chen Zhu, Pan Li, and Hui Xiong. Measuring the popularity of job skills in recruitment market: A multi-criteria approach. In Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018.

[Zhang et al., 2014] Yingya Zhang, Cheng Yang, and Zhixiang Niu. A research of job recommendation system based on collaborative filtering. In 2014 Seventh International Symposium on Computational Intelligence and Design (ISCID), volume 1, pages 533–538. IEEE, 2014.

[Zhu et al., 2014] Chen Zhu, Hengshu Zhu, Yong Ge, Enhong Chen, and Qi Liu. Tracking the evolution of social emotions: A time-aware topic modeling perspective. In 2014 IEEE International Conference on Data Mining (ICDM), pages 697–706, 2014.

[Zhu et al., 2016] Chen Zhu, Hengshu Zhu, Hui Xiong, Pengliang Ding, and Fang Xie. Recruitment market trend analysis with sequential latent variable models. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 383–392. ACM, 2016.