Connecting to MongoDB from IBM Bluemix - Jupyter Notebooks on Spark
- Create an account in Bluemix (IBM offers a 30-day free trial) - https://console.ng.bluemix.net/registration/
- Create a Spark service (https://www.ng.bluemix.net/docs/services/AnalyticsforApacheSpark/index.html)
- Now create a notebook with Scala as the language.
- Add the UnityJDBC jar, which contains the MongoDB JDBC driver.
%Addjar https://github.com/charles2588/SparkNotebooksJars/raw/master/unityjdbc.jar
- Add the Mongo Java Driver jar, which UnityJDBC needs.
%Addjar https://github.com/charles2588/SparkNotebooksJars/raw/master/mongo-java-driver-2.13.3.jar
- Test the import below.
import mongodb.jdbc.MongoDriver
- Import the DataFrame and SQLContext classes.
import org.apache.spark.sql.{DataFrame, SQLContext}
-
Simply replace url with your MongoDB URL,
dbtable with the name of the collection for which you want to create a DataFrame,
and user and password with the credentials for your MongoDB server.
val url = "jdbc:mongo://ds045252.mlab.com:45252/samplemongodb"
val dbtable = "Photos"
val user = "charles2588"
val password = "*****"
val options = scala.collection.Map("url" -> url, "driver" -> "mongodb.jdbc.MongoDriver", "dbtable" -> dbtable, "user" -> user, "password" -> password)
-
Now create a new SQLContext from your Spark Context, which has the MongoDB driver loaded
val sqlContext = new SQLContext(sc)
-
Create a DataFrameReader from your SQLContext for your table
val dataFrameReader = sqlContext.read.format("jdbc").options(options)
-
Call the load method to create a DataFrame for your table.
val tableDataFrame = dataFrameReader.load()
-
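The reader construction and the load call can also be chained into a single expression. This is a sketch of the same step, assuming the `sc` Spark Context and the `options` map defined earlier:

```scala
// Equivalent single-expression form: build the reader and load the DataFrame
// in one chain, instead of keeping an intermediate DataFrameReader value.
// Assumes `sc` and `options` are defined as in the steps above.
import org.apache.spark.sql.SQLContext

val sqlContext = new SQLContext(sc)
val tableDataFrame = sqlContext.read
  .format("jdbc")
  .options(options)
  .load()
```

Whether you keep the intermediate `dataFrameReader` or chain the calls is a style choice; chaining is common when the reader is used only once.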
Call the show() method to display the table contents in the notebook
tableDataFrame.show()
- You have successfully created a DataFrame from MongoDB; now you can do further processing according to your needs.
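As a sketch of such further processing (the column name "owner" is hypothetical; substitute a field that actually exists in your collection):

```scala
// A few typical follow-up operations on the loaded DataFrame (Spark 1.x API).
// "owner" is a hypothetical column name used for illustration only.
tableDataFrame.printSchema()   // inspect the schema inferred from the collection

// Filter rows by a column value and display the result.
val filtered = tableDataFrame.filter(tableDataFrame("owner") === "charles2588")
filtered.show()

// Register the DataFrame as a temporary table and query it with SQL.
tableDataFrame.registerTempTable("photos")
sqlContext.sql("SELECT COUNT(*) FROM photos").show()
```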