Tag Archives: getting-started

Prerequisites to get started with Big Data

Very often I get the query ‘I am familiar with X, Y and Z; is that good enough to get started with Big Data?’ This post addresses that question. Here we will look at what is required to get started with Big Data and the rationale behind it.

There is a lot of information on the internet about each of these technologies, and it is easy to get lost. So references for getting started with them are also included.

Linux : Big Data (the Hadoop revolution) started on the Linux OS. Even now, most Big Data software is initially developed on Linux, and porting to Windows has been an afterthought. Microsoft partnered with Hortonworks to speed up the porting of Big Data software to Windows.

To get started with the latest software around Big Data, knowledge of Linux is a must. The good thing about Linux is that it is free and it opens up a lot of opportunities. I would recommend going through all the tutorials here, except the seventh one.

There are more than 100 different flavors of Linux, and Ubuntu is one of the popular distributions for those who are new to Linux.

Java : Most Big Data software is developed in Java. I say most because there are exceptions: Spark has been developed in Scala, Impala has been developed in C/C++, and so on.

To extend Big Data software, knowledge of Java is a must. Also, the documentation is sometimes not up to the mark, so it might be necessary to go through the underlying code of the Big Data software to see how something works or why it is not working the way it is expected to.

For the above-mentioned reasons, knowledge of Java is a must. The basics of core Java are enough; knowledge of enterprise Java is not required. Go through the Java Basics section and the Java Object Oriented section here.

Java programs can be developed with something as simple as Notepad. But developing in an IDE like Eclipse makes it a piece of cake. Here is a nice tutorial on Eclipse.

SQL : Not everyone is comfortable with programming in Java and other languages. That’s the reason why SQL abstractions have been introduced on top of the different Big Data frameworks. Those who come from a database background can get started with Big Data very easily because of the SQL abstraction. Hive, Impala and Phoenix are a few examples of such software.

Expertise in SQL is not required. The basics of the DDL and the DML operations are more than enough. Here are some nice tutorials to get started with SQL.
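To give a feel for how little SQL that means, here is a minimal sketch in Python using the SQLite database that ships with it. The `readings` table and its rows are made up for illustration, and SQLite's dialect is not identical to what Hive, Impala or Phoenix accept, so treat it as a warm-up rather than as Big Data code.

```python
import sqlite3


def run_sql_basics():
    """Create a table (DDL), change its rows (DML), and return what is left."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    cur = conn.cursor()

    # DDL: define the structure of the data (hypothetical table).
    cur.execute(
        "CREATE TABLE readings (id INTEGER PRIMARY KEY, city TEXT, temp_c REAL)"
    )

    # DML: insert, update and delete rows.
    cur.executemany(
        "INSERT INTO readings (city, temp_c) VALUES (?, ?)",
        [("Amsterdam", 11.5), ("Hyderabad", 31.0), ("Oslo", 4.0)],
    )
    cur.execute("UPDATE readings SET temp_c = ? WHERE city = ?", (12.0, "Amsterdam"))
    cur.execute("DELETE FROM readings WHERE city = ?", ("Oslo",))
    conn.commit()

    # Query: read the remaining rows back in a fixed order.
    rows = cur.execute("SELECT city, temp_c FROM readings ORDER BY city").fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for city, temp_c in run_sql_basics():
        print(city, temp_c)
```

If the CREATE, INSERT, UPDATE, DELETE and SELECT statements above look familiar, that is roughly the level needed to start with the SQL abstractions.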

Others : The above-mentioned skills are good enough to get started with Big Data. As one gets deeper into Big Data, I would also recommend looking at R, Python and Scala. Each of these languages has its strengths and weaknesses, and depending upon the requirement the appropriate one can be picked to write Big Data programs.

To become good at Big Data, an aspirant needs a good overview of the different technologies. The above guide mentions what is required and where to start reading about them.

Best of luck!

Inventory for getting started with IOT

In the previous blog, I wrote about getting started with IOT. I have been going through the starter guide and building some of the basic circuits using the sensors, actuators and displays. Here is a circuit which displays the current temperature using the Arduino Uno Starter Kit. It’s no magic, but it’s nice to build something and see it physically. Until now, it had all been about writing invisible software code.

As an extension to the above circuit, an Ethernet shield can be added to the Arduino Uno board and the data can be sent to a NoSQL database (maybe Mongo) in one of the Clouds using the REST interface. It’s simply a matter of integrating IOT with Big Data and the Cloud. More on this in the coming blogs.

As mentioned in the previous blog, the starter kits have all the components needed to avoid frequent trips to the electronics shops, and they also come with a nice manual covering the basic how-tos. Once the initial hurdles are crossed, imagination is the limit on what can be done.
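Until then, here is a rough sketch of what one such REST call could look like. It is written in Python and meant to be tried from a laptop, not on the Arduino itself. The URL, the `sensor_id` and `temp_c` field names, and the JSON layout are all assumptions for illustration; a real database service will define its own.

```python
import json
import urllib.request

# Hypothetical endpoint; replace with the REST URL of your own database service.
API_URL = "https://example.com/api/readings"


def build_reading_request(sensor_id, temp_c, url=API_URL):
    """Build (but do not send) an HTTP POST carrying one temperature reading as JSON."""
    # Assumed payload layout: one reading per request, rounded to 0.1 degrees.
    payload = {"sensor_id": sensor_id, "temp_c": round(temp_c, 1)}
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_reading_request("uno-1", 23.46)
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # To actually send it: urllib.request.urlopen(req, timeout=10)
```

The request is only built, not sent, so the sketch runs without a network connection. On the board, the same JSON body would have to be written out by the sketch running on the Arduino through the Ethernet shield.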

While getting started with IOT, what surprised me was the concept of OSH (Open Source Hardware) similar to the OSS (Open Source Software). In the OSH model, all the physical artifacts are designed and offered in an open fashion. Anyone can download the design for a circuit board, improve on it and build clones of the physical artifact.

The Arduino Uno microcontroller is one such component. Anyone can download the specs of the Arduino Uno, improve on it and build clones of it. Unknowingly, I ordered a clone called the Vilros Uno. It’s not a big deal; the Vilros Uno is a clone of the Arduino Uno and has the same specifications.

Along with the Arduino Starter Kit, I also ordered the Raspberry Pi 2 B (from element14) and a starter kit for it. The Raspberry Pi 2 B was announced recently and is relatively new. So, it will take a week or so before I get my hands on it.

While the Arduino Uno includes a microcontroller, the Raspberry Pi 2 B includes an SoC (System on Chip). By using the Raspberry Pi 2 B, it should be possible to build a small and useful computer. The bare circuit towards the right of the monitor is the Raspberry Pi 2 B board, which is connected to the monitor (through HDMI), keyboard (through USB), mouse (through USB) and the power supply. As of now, it can run Debian and Ubuntu Linux; down the line it should also be possible to run Windows 10 on it. It might not be as powerful as the latest processors in the market, but it is decent enough to browse, play some games, work with Open Office, watch movies, do a bit of programming, etc.

Here is the inventory I have ordered so far to get started with IOT. I haven’t received all of it yet. Once I receive the complete inventory and start playing with it, I will write a review of each item.

To get started with the Raspberry Pi 2 B

To get started with the Arduino Uno

Other components

If you know someone who is getting started with computers, then a Raspberry Pi 2 B is a really nice thing to start with. More than 5 million boards of the Raspberry Pi family have been sold so far. And the interesting thing is that the Raspberry Pi is being introduced in schools for kids to get started with computers.

Getting started with IOT

I started playing with circuits (555) when I was in 8th grade and then lost focus on it. After that it was all software, just keying away on the keyboard. I was looking for a nice hobby to have some fun with, and it looks like I am going back to the circuit world. I did my Engineering in Electrical & Electronics almost 17 years back, so that helped me to get started. Quite a lot has changed since then, but a resistor is still a resistor.

Lately, the IOT (Internet Of Things) hype has been bugging me, and I started exploring the different aspects of it. Here is a good definition of what IOT is all about. It’s not just about hooking up a bunch of sensors and collecting the data; it’s all about making the human experience better.

To start with, I got a bunch of starter kits. These starter kits come with a microprocessor or microcontroller and a bunch of components (capacitors, resistors, diodes, sensors, switches, etc.), along with a nice book to get started. I would very much recommend the starter kits, so as to avoid frequent visits to the electronics stores. Once comfortable with the starter kit, one can make an inventory of what else is required to build something more advanced and get all of it in a single trip to the shop.

The above is an Arduino Uno (it’s a microcontroller) with a breadboard and a flashing LED. Around the time I was ordering a Raspberry Pi (microprocessor), the Raspberry Pi Foundation announced the Raspberry Pi 2, with double the memory and 4 times the cores at the same price as the original Pi. The foundation claims that applications run 6 times faster on the Pi 2 than on the original Pi. So, I thought of being a bit patient and ordering the Pi 2 instead, which I should be getting in a couple of weeks. It should be possible to run Ubuntu Snappy and Windows 10 on the Pi 2.

In the coming blogs, I will be writing about the different components I bought and whether I made a good or bad choice in buying them. It will take some time for me to get started, but I will also blog about some of the interesting things I start tinkering with by integrating the three things I am interested in (Big Data, Cloud and IOT).