You are planning to complement your traditional data warehouse architecture with big data technologies. Now what? Should you upskill your existing data warehouse team? Or do Big Data technologies require a completely different set of skills?
What do we mean by big data technologies anyway? For the purpose of this article, I define big data as any distributed technology that is not a relational database. According to this definition, a distributed relational database (MPP) such as Redshift, Vertica, Teradata etc. is not a big data technology. I know. Other definitions for big data exist, e.g. any data volume that we can’t easily fit on a single server.
Tackling big data projects with a data warehouse mindset?
Your company already has a data warehouse built on traditional data warehouse architecture. Either a Kimball collection of conformed data marts or a Corporate Information Factory built on Inmon. Great! You already have a lot of experience running data projects successfully.
Let’s look at the various roles on your team and check if they are big data ready.
I presume you already run your data warehouse projects in an agile fashion. Not many changes there. It’s still scrum, user stories, spikes, and daily stand-ups. However, big data projects …