Julia, a programming language released in early 2012, was developed by a team that included computer scientists Stefan Karpinski and Jeff Bezanson, with the objective of using a single language for any computing task. Existing programming languages were each designed with a different end goal. For example, R is built to make it easy for researchers to run statistical algorithms, and Matlab is well suited to linear algebra and matrix calculations, whereas C and Java are known for fast execution. As a result, programmers often need more than one language to complete a single computing task, which means writing, compiling and running code in different syntaxes and coding styles.
To solve these problems, Julia was created as a free, open source and library-friendly language. It is commonly described as a high-level, high-performance dynamic programming language for technical computing, with syntax similar to that of other technical computing environments. It ships with many mathematical and statistical functions built in, of the kind commonly found in high-performance environments. In addition, the Julia developer community contributes external libraries and packages, which are installed through Julia’s built-in package manager.
Julia is built primarily for speed, and applications written in it rather than in Python or R have been found to run considerably faster. Much of Julia’s speed is attributed to its LLVM-based just-in-time (JIT) compiler, which often matches the performance of C. It also supports advanced tasks such as cloud computing and parallelism, which are fundamental to big data analytics. Let us now look at a small problem used to benchmark the performance of Julia, Python and R, including faster variants of R such as pqR. The problem is to calculate the smallest number that is divisible by all of the numbers in a factorial. For example, for the numbers in 5!, 60 is the smallest number that is divisible by 2, 3, 4 and 5. Benchmark comparison results are shown in the image above, which shows Julia's faster computation times for various factorials.
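To make the benchmark problem concrete, here is a minimal sketch in Python, one of the languages compared. It is not the code used in the benchmark. The function names lcm and smallest_multiple are illustrative assumptions, and the approach (folding a least common multiple over the integers from 1 to n) is just one straightforward way to solve the problem.

```python
from functools import reduce
from math import gcd


def lcm(a: int, b: int) -> int:
    """Least common multiple of two positive integers."""
    # Divide before multiplying to keep the intermediate value small.
    return a // gcd(a, b) * b


def smallest_multiple(n: int) -> int:
    """Smallest positive integer divisible by every integer from 1 to n.

    Hypothetical helper for illustration; the benchmark itself may have
    used a different (for example, brute-force) method.
    """
    return reduce(lcm, range(1, n + 1), 1)


if __name__ == "__main__":
    # Print the answer for a few values of n.
    for n in (5, 10, 20):
        print(n, smallest_multiple(n))
```

With this sketch, smallest_multiple(5) returns 60, the smallest number divisible by 2, 3, 4 and 5. A benchmark in this style would time the same calculation for increasing values of n in each language.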
Julia for Big Data Analytics?
For big data analytics tasks, the most popular open source language choices right now are R and Python. R has traditionally been popular for its vast set of statistical packages, while Python is a more general-purpose language that has seen increasing adoption by the big data community in recent years because of its ease of use and flexibility. Julia was created mainly for technical computing, and with advanced features such as distributed computation it can readily be adapted to big data analytics tasks. In terms of performance, Julia also has the advantage of shorter run times than R and Python because of its compiler-based architecture. At this point, however, Julia lacks external packages and libraries for applications specific to big data analytics, which limits its immediate adoption for many big data tasks. Given that the big data industry changes constantly, with new frameworks and technologies arriving every six months or so, it would not be surprising to see Julia gain adoption along the lines of R and Python in the not so distant future.