Data Science Experience (DSX)

At DataWorks Summit this year, two giants of the IT world, IBM and Hortonworks, announced their resolve to develop a tool that would harness the enormous amounts of data stashed in Hadoop installations across different verticals and use that data to generate predictive insights.

It was not exactly a path-breaking announcement, since Data Scientists (the so-called Data Charmers) have long been paid hefty amounts for this very purpose.
Recently, IBM released DSX, a unified tool that addresses the entire process of data science and machine learning.
Why DSX?
“Data is the king and queen” has been the motto of every organization since the days of the Industrial Revolution. Companies preserved this asset in data warehouses, but often a big chunk of it remained untapped because of the limited processing power at their disposal.
Almost a decade ago, with the advent of Hadoop, this changed: large amounts of data can now be processed on commodity hardware in reasonable time.

Earlier, Data Scientists built models using only a subset of the data to learn its patterns (small data, small learning), but with the compute provided by Hadoop they can use big data to learn more accurate patterns (Big Data, Big Learning). So the combination of data science and big data was inevitable, which is why IBM, a leader in data analytics and data science, and Hortonworks, a prominent player in the big data world, joined hands.
“Data is the new oil” is the motto of organisations these days :)
What is DSX?
IBM Data Science Experience is an analytics environment that includes popular tools such as Jupyter Notebook and RStudio on top of Spark-as-a-service. Currently Python, R and Spark are supported, and efforts are under way to include Zeppelin, another tool popular in the analytics community.
Using DSX, one can abstract away the complexity of integrating heterogeneous tools to build and deploy machine learning models. It has a Spark engine for processing the data, an object store to hold trained models, and seamless deployment of those models from development into production.
It can also be used as a collaboration tool: different developers working on the same project can exchange ideas and code without any pain. Also, the cognitive capabilities of the platform guide an inexperienced user to learn, design, develop and deploy ML models seamlessly.
Data Science Experience is offered in three variants:
- DSX Cloud, the cloud-based version
- DSX Local, the on-premise version
- DSX Desktop
Developers can develop on-premise and push their code into production in the cloud, i.e. develop a scikit-learn or R model locally using some data and score this model on new data in the cloud.
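The develop-locally, score-elsewhere workflow can be sketched in plain Python. This is only an illustration under stated assumptions: it does not use any DSX API, a hand-rolled least-squares line stands in for a scikit-learn or R model, and the names `fit_line`, `export_model` and `score` are hypothetical. The point is that the trained model is reduced to a portable artifact (here a JSON string) that a separate environment can load and apply to new data.

```python
import json


def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares (the 'local' step)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return {"slope": slope, "intercept": intercept}


def export_model(model):
    """Serialize the trained parameters into a portable artifact (hypothetical helper)."""
    return json.dumps(model)


def score(artifact, new_xs):
    """Load the artifact and predict on unseen data (the 'cloud' step)."""
    model = json.loads(artifact)
    return [model["slope"] * x + model["intercept"] for x in new_xs]


# Train locally on a small sample where y = 2x + 1 exactly.
artifact = export_model(fit_line([1, 2, 3, 4], [3, 5, 7, 9]))

# Elsewhere: score data the model has never seen.
predictions = score(artifact, [10, 20])
```

In a real setup the artifact would more likely be a serialized scikit-learn or R model object, and the scoring side would sit behind a deployed service rather than a local function call.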
The versioning capabilities built into DSX help put your model into a continuous training loop, i.e. your model can adjust to new incoming data and realign itself without hampering your end users or performance.
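A continuous training loop of this kind can be sketched as a small versioned registry. This is a toy illustration, not DSX's versioning API: the `ModelRegistry` class, its methods and the mean-predictor "model" are all assumptions made for the example. Each retraining produces a new candidate version, and that candidate replaces the version serving users only if it does better on recent data, so a bad retrain never reaches production.

```python
class ModelRegistry:
    """Hypothetical registry: keeps every trained version, serves one of them."""

    def __init__(self):
        self.versions = []      # all models ever trained, oldest first
        self.live_index = None  # index of the version currently serving

    def train(self, values):
        """'Train' a trivial model that predicts the mean of its training data."""
        self.versions.append(sum(values) / len(values))
        return len(self.versions) - 1

    @staticmethod
    def error(model, values):
        """Mean absolute error of a constant prediction on the given values."""
        return sum(abs(v - model) for v in values) / len(values)

    def retrain(self, train_values, holdout_values):
        """Train a candidate and promote it only if it beats the live version."""
        candidate = self.train(train_values)
        if self.live_index is None:
            # Nothing is serving yet, so the first version goes live.
            self.live_index = candidate
        else:
            live_err = self.error(self.versions[self.live_index], holdout_values)
            cand_err = self.error(self.versions[candidate], holdout_values)
            if cand_err < live_err:
                self.live_index = candidate
        return self.live_index

    def predict(self):
        """Return the prediction of whichever version is currently live."""
        return self.versions[self.live_index]


registry = ModelRegistry()
registry.retrain([10, 12, 11], [11])          # version 0 goes live
registry.retrain([20, 22, 21], [21, 20, 22])  # data drifted: version 1 is promoted
registry.retrain([0, 1, 2], [21, 20, 22])     # bad retrain: version 1 stays live
```

Because every version is kept in `versions`, rolling back is just a matter of pointing `live_index` at an earlier entry.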
The data store is not locked to one vendor. One can also use AWS S3 or some other cloud offering, as well as MySQL.
A lot of community-driven code (Jupyter Notebooks) and tutorials are also provided to help master the craft of predicting.
How DSX?
To learn more about DSX:
1. Sign up for the free trial (limited period) of DSX here.
2. Explore the various tutorials and code notebooks available on the dashboard (once you log in).
Disclaimer: IBM and Hortonworks are registered trademarks of their respective owners. Images used in this post are sourced from the Internet.