Thursday, February 18, 2016

So what papers are IJCAI-16 PC members most interested in reading (based on the bid popularity)?

One of the unsaid things about a large conference like IJCAI is that the quality of reviews a paper gets is critically correlated with the number of bids that the paper gets.

When a paper doesn't get enough bids, it needs to be matched manually to PC members, which is very error-prone when we are talking about 2000 papers and 1700+ reviewers (and the ever more condensed reviewing time).

Having just completed some 10,000+ reviewer assignments using 53,000 reviewer bids, I am struck not just by the well-known long-tail phenomenon in the number of bids papers get, but also by how many papers get almost "shut out" (get zero bids).

So we thought it would be fun to do an analysis of which papers are getting a lot of bids (based on a word cloud analysis of the paper titles).

52,892 bids were made by 1723 program committee members over 2000 papers in the main track (or an average of roughly 25 bids/paper and 30 bids/PC member). Here is how they were distributed:

So, we naturally wanted to find out which papers are getting more vs. fewer bids. Numbering the bins 0 (for <=3 combined bids), 1 (for 4-10 combined bids), etc., we made a word cloud for the papers in each bin (based on their titles). Thus the cloud (helpfully shaped) 8 has the words occurring in the titles of papers getting >70 bids ;-)
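For anyone who wants to try something similar on their own conference data, here is a minimal sketch of the binning and word counting step. It is not the script we actually used: only the <=3, 4-10 and >70 cut-offs come from this post, while the intermediate bin boundaries, the stopword list, and the function names are assumptions made for illustration. It also keeps hyphenated terms together as single tokens, which is a choice of this sketch.

```python
import re
from bisect import bisect_left
from collections import Counter, defaultdict

# Inclusive upper bounds of bins 0..7; bin 8 is everything above 70.
# Only 3, 10 and 70 come from the post; the values in between are hypothetical.
BIN_UPPER_BOUNDS = [3, 10, 20, 30, 40, 50, 60, 70]

# A tiny illustrative stopword list (an assumption, not the one used for the clouds).
STOPWORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to", "via", "with"}


def bin_index(num_bids):
    """Map a bid count to its bin number (0 = fewest bids, 8 = most)."""
    return bisect_left(BIN_UPPER_BOUNDS, num_bids)


def title_words(title):
    """Lowercase a title and split it into words, keeping hyphenated terms whole."""
    tokens = re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", title.lower())
    return [t for t in tokens if t not in STOPWORDS]


def word_counts_by_bin(papers):
    """papers: iterable of (title, num_bids). Returns {bin number: Counter of words}."""
    counts = defaultdict(Counter)
    for title, num_bids in papers:
        counts[bin_index(num_bids)].update(title_words(title))
    return counts
```

Each per-bin `Counter` holds the word frequencies that a word cloud tool would then render, one cloud per bin.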

Here then are the bins, from the "least bids (0)" to the "most bids (8)":

So there you go... I have my own interpretation of this data, but I would rather hear your interpretations ;-)

(with all the help from Lydia Manikonda, IJCAI-16 Data Scientist)

1 comment:

  1. I see lots of "based", which is usually hyphenated with something else. Perhaps it would have been better to keep hyphenated terms as a single word. Any idea of what most often goes with "-based"?