Monday, June 13, 2016

Spark 2.0 is out


Spark Summit East Keynote: Apache Spark 2.0


How do you get your hands on Spark 2.0?
1. Databricks Community Edition
2. Download and set it up


Major features:
1. Tungsten Phase 2 speedups of 5-10x
2. Structured Streaming real-time engine on SQL/DataFrames
3. Unifying Datasets and DataFrames

Thursday, June 9, 2016

Running your first R notebook on IBM Bluemix Apache Spark Service

The IBM Bluemix Apache Spark Service has introduced a tech preview of R that lets users run R programs on a Spark cluster.
https://developer.ibm.com/clouddataservices/docs/spark/technical-previews/r-in-jupyter-notebooks/
So how do you get started with an R notebook on Spark?

You need to create a new instance of the service, because the tech preview was only introduced in May 2016. Please check it out.
I have a simple Pi calculator example here, if you just want to import it and give the service a try: https://github.com/charles2588/bluemixsparknotebooks/raw/master/R/Pi_Bluemix.ipynb
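
If you are curious what a Pi calculator does before importing the notebook, here is a rough sketch of the usual Monte Carlo idea. It is written in plain Python rather than R, and it is not the code from the notebook above: I am assuming the Monte Carlo approach, and the function name `estimate_pi` and its parameters are made up for illustration. On Spark the random sampling would be spread across the cluster instead of running in one loop.

```python
import random


def estimate_pi(num_samples, seed=None):
    """Estimate Pi by sampling random points in the unit square.

    Hypothetical helper for illustration only; not taken from the notebook.
    """
    rng = random.Random(seed)  # a local generator keeps runs reproducible
    inside = 0
    for _ in range(num_samples):
        x = rng.random()
        y = rng.random()
        # Count the points that fall inside the quarter circle of radius 1.
        if x * x + y * y <= 1.0:
            inside += 1
    # The quarter circle covers Pi/4 of the unit square.
    return 4.0 * inside / num_samples


if __name__ == "__main__":
    print(estimate_pi(100000, seed=42))
```

More samples give a closer estimate, which is why this is a nice first job to hand to a cluster.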

Free Beta Data Science Tools with Spark

Below are links to beta programs and community editions that let you test your Spark programs on Spark servers without having to set anything up.

IBM

Sign up for the IBM Data Science Experience. Beta wait-list.
http://datascience.ibm.com/


Databricks

Sign up for the Community Edition.
This gives you a free Spark instance. Beta wait-list.

https://databricks.com/try-databricks