Sunday, August 31, 2014

Do You Know Big Data?

Vincent Granville published a great poster about Big Data on Data Science Central, originally posted and provided by Altamira. You can download it as a PDF document.

Thursday, July 3, 2014

Data Science, Big Data and Statistics – can we all live together?

A tweet from Persontyle caught my attention about a great lecture by Terry Speed, an emeritus professor of statistics at the University of California at Berkeley, published at Flowingdata. The lecture is part of the Chalmers Initiative Seminar on Big Data, which took place last March at Chalmers University of Technology. In the lecture, Terry Speed talks about how statisticians can play nice with big data and data science. He reports on some reflections on Big Data issues, offers some suggestions for statisticians, and summarizes some theory which, in his opinion, has relevance to the analysis of data, whoever does it. Worth watching!

Vimeo link:

Saturday, May 31, 2014

10 Big Data Pros To Follow On Twitter

Last week, InformationWeek published an article with the 10 Big Data Pros To Follow On Twitter, written by Kevin Casey. I'm very honored to have been mentioned in the list.

According to the article: "Twitter's kind of an ironic place to look for big data wisdom. It's an example of the ubiquitous services used by consumers and businesses alike that help generate this avalanche of data in the first place. Twitter has a valuable collection of big data knowledge -- if you know where to find it. Like other social platforms, Twitter can sometimes get noisy. Throw in a buzzword like 'big data,' and the noise can get downright cacophonous. So how do you find the information you want?"
He wrote a cool description of my Twitter account: "Borba's active feed is a particularly good read if you're interested in how big data translates to bottom-line business -- in other words, making money. In addition to offering regular tweets on big data and analytics, Borba shares recommendations of other people to follow."

Monday, May 19, 2014

Dilbert on Big Data Brother

(Click on comic strip to view larger image)
Dilbert by Scott Adams-Dilbert ©2014, Universal Uclick
Published on May 11, 2014 in

Sunday, April 13, 2014

Top Big Data Executives and Experts to Follow on Twitter

A few days ago, CEO World Magazine published the Top Big Data Executives and Experts to Follow on Twitter. I'm very honored to have been selected for the list at #8, alongside so many great people.

According to Amarendra Bhushan, Editorial Director of CEO World Magazine: "To recognize and learn more about big data technologies and architecture, vendor developments in the big data and analytics industry, and numerous technical challenges. I have decided to compile a list of the Most Influential Voices in Big Data and Quantitative Analytics Arena. And what they think are their biggest challenges when implementing big data in their storage environments."

Tuesday, March 25, 2014

Celebrating 6 years of blogging

My blog turns six years old today. Thank you all for reading it over the last six years. As always, your readership is really appreciated.

5 Big Data myths debunked

Recently I read a great article entitled 5 Big Data myths debunked, in which the author, Sanne Steegstra, questions the main myths of Big Data clearly and humorously. Below is a summary of the article:

Myth 1: Big Data is big
Nope. Big is relative. The ‘Big’ is marketing lingo, an attention grabber, as if we should automatically be afraid of scary unknown big things. Big actually means ‘more difficult to access and query than we are used to’. Yes, at some point you will probably need new tools. Most people and companies are not working with Big Data, just data.

Myth 2: Big Data is a technology thing
Wrong, it is a paradigm shift in business models. And up until the new models are a common thing, Big Data will keep its adjective ‘Big’. Don’t be fooled by the traditional soft- and hardware vendors that are the first ones to step up to sell you Big Data Solutions. This doesn’t mean it’s an IT thing, don’t confuse the message with the messenger. Big data is about acting smart, and right now about changing organizations to be smart, to have a competitive advantage with a better understanding and serving of customer needs.

Myth 3: There is an enormous shortage in analytical talent and experienced analysts.
A lot of companies are having a hard time recruiting the right analytically skilled people. The quest for analysts and data scientists is a hard one, with consultancy agencies acclaiming that message in advertorials everywhere. Although there can never be enough analytical talent, and yes there is a shortage, an important part of the problem lies somewhere else.

Your company is not interesting enough
Let’s be honest, your company’s corporate website might show pictures of young and good-looking people, working on an apparently fun business problem while pointing at a computer. However, in the non-Photostock reality you are probably a boring company that sells boring products and has boring problems to solve. So how on earth can you compete with interesting start-ups and cool, tech-savvy companies? Well, just create interesting problems. Create multidisciplinary teams where analytical talent is not only (mis)used as support. Allow them to create the interesting problems.

You already hired them, they are working in the wrong department
And I bet it is the IT department. All your STEM (Science, Technology, Engineering, Mathematics) skills are in one place, nicely hidden away on a separate floor or in some sort of ‘in-company quarantine’. So besides a big change in mentality and organization, there are two things you can do. You either teach your marketing staff analytical skills, or you teach your analytical staff marketing skills.

Data Analytics still is unnecessarily complex
A data analyst loves to analyze data, not the hardship of accessing the data. Are programming skills required to create cool tools, models and applications? Or are they an absolute necessity because otherwise 90% of the time would be filled by meeting with the IT department? Data analysts work on the frontiers of data. That means the data is by definition not structured, seldom relational and hardly quickly accessible. The company providing a plug-and-play sandbox solution for all company data will close an important part of the analytical gap.

Myth 4: Big Data is social data
Social is data’s super sexy showcase. It will often start with social data, not only because it (still) is up for grabs and there is a lot of it, but also because everything with a like button on it appeals to marketers, creative companies and more than the usual suspects (yep IT, BI and CRM, I mean you guys). So don’t forget to have a focus on your own ‘big’ data. Got a central cash register system, a website with a large volume of clicks and views (and it’s up to you if you decide to label it large or ‘big’) or even better, have check-ins, sensor data or some other sort of activity generated data? Even better! That’s your big data!

Myth 5: Big Data is a hype
Stop the definition debate. Who cares. Everybody agrees on the possibilities and disruptive force data can have. The era of data has already begun.

Sunday, March 23, 2014

Kenneth Cukier on Big Data: The data revolution

Kenneth Cukier, the Data Editor of The Economist, gave a good interview about his new book and how our unprecedented access to data changes how we live and think. Watch and enjoy!

Thursday, March 20, 2014

Why smart statistics are the key to fighting crime

I watched a good TED lecture on the use of analytics to fight crime by Anne Milgram, an American attorney. In the lecture, Anne Milgram explains that when she became the attorney general of New Jersey in 2007, she quickly discovered a few startling facts: not only did her team not really know who they were putting in jail, but they had no way of understanding if their decisions were actually making the public safer. And so began her ongoing, inspirational quest to bring data analytics and statistical analysis to the US criminal justice system.

Anne Milgram said that she decided to focus on using data and analytics to help make the most critical decision in public safety: determining, when someone has been arrested, whether they pose a risk to public safety and should be detained, or don't pose a risk and should be released. Everything that happens in criminal cases comes out of this one decision. It impacts everything. It impacts sentencing. It impacts whether someone gets drug treatment. It impacts crime and violence.

So she went out and built a phenomenal team of data scientists, researchers and statisticians to build a universal risk assessment tool, so that every single judge in the United States of America can have an objective, scientific measure of risk. To build the tool, they collected 1.5 million cases from all around the United States: from cities, from counties, from every single state in the country, and from the federal districts. Their goal is that every single judge in the United States will use a data-driven risk tool within the next five years. She finished with a statement: "Some people call it data science. I call it moneyballing criminal justice."

TED link: Why smart statistics are the key to fighting crime