NP or Not NP

That is the question.  While many problems in computer science have been classified as P or NP-complete problems there is a whole other group of problems that computer scientists haven’t been able to place into each category yet.  Once such problem is known as the “graph isomorphism” problem.  This question posed in this problem is whether two networks that look different are actually the same.  In class we talked about the significance of the P vs. NP question.  While this problem does not have quite the same widespread impact, it had been an open problem in computer science for quite some time now.

The previous best solution for this problem was introduced in 1983. But, in December 2015 professor Laszlo Babai, of the University of Chicago, introduced a new solution that was more efficient than the previous one.  He proceeded to later retract his claim after a flaw in his algorithm was pointed out, but on January 9th, 2017 he announced that he had fixed the error and reintroduced his algorithm.  Both his solution and the previous best one exist in the odd space between P and NP-complete problems.  The previous solution having a complexity classified as “subexponential” and Babai’s new solution being classified.  The question of how to classify the graph isomorphism remains open but as more efficient algorithms continue to be found, many computer scientists are of the opinion that it will eventually be found that it belongs in P.

So what is the significance of this result?  Well a very important distinction needs to be made.  Graph isomorphism is know to be an NP problem but it is not know whether it is NP-complete.  Therefore even if a polynomial solution is found, it has no significant effect on the P vs. NP question, at least for now.  This doesn’t make it is totally insignificant though.  This problem does have the property of being a universal one.  This means that any problem asking whether two “combinatorial structures” are isomorphic can be somehow thought of as a graph isomorphism problem.  One such example is the question of whether two sudoku puzzles are the same.  It may not lead directly to discovering a factorization algorithm that runs in polynomial time but it is certainly an exciting result in the field of computer science.


The History of R

In class, Prof. Schedlbauer mentioned that the R language got its name from the language it was created to be an alternative to, S.  We also discussed the fact that R has many “quirks” (or flaws depending on who you ask) due to it being created completely by open-source contributors.  I have always found the history of programming languages interesting and enjoy following a language from its roots to the version of it that is in use today and these conversations about R made me realize that I knew very little about its history so I did some research and thought I should share it here.

R was originally created at the University of Auckland, New Zealand by a pair of professors, Ross Ihaka and Robert Gentleman, to aid undergraduate students in doing data analysis.  While it’s name is partially a tongue-in-cheek reference to the S language that much of its functionality was inspired by, it is also a reference to the first letter of both of it’s creators first names.  According to this article, conversations between the two about developing the tool began began in the early 1990’s with work on it starting in 1991.  Neither of the languages creators ever anticipated it becoming the widely used tool that it is today.  Instead, they hoped that it may be used to help teach intro level statistics classes.

R’s first open source release came in 1995 and in 2000 R 1.0.0, the first official version, was released.  Since then, the language has grown to serve an incredibly large user base, although the exact size of that user base is extremely difficult if not impossible to determine.  One way to measure the popularity of R is to look at contributions to its open source package repository, CRAN.  This was done by one R user who created the following graph of the number of published R packages on CRAN (yes, he did it using R).


This exponential growth can most likely be attributed in part to the recent growth in data science related fields as more and more users find R to be the tool they need and look for ways to improve it.  It’s quite possible that after our introduction to R in this class, we will become contributors to this rapidly growing community.