Data Engineering From A Data Scientist’s Perspective

We’ve had technical people focused on the ingestion and management of data for decades. Only recently, however, has data engineering become a critical, widespread role. Why is that? This post outlines a somewhat contrarian view of why data engineering has become so important and how we might expect the role to evolve over time.

IIA expert Jesse Anderson recently wrote a nice piece discussing why data engineering and data science must be viewed as distinct skill sets and how organizations get into trouble when asking people to work outside of their core skills. That piece, along with some recent client discussions, led me to this post.

Database Administration, ETL, And Such

It wasn’t long ago that the roles focused on enterprise data fell largely into three areas. First, there were those who managed raw data collection into source systems. Such systems often use some sort of custom data format that is far from user-friendly. Second, there were those focused on Extract, Transform, and Load (ETL) operations. ETL specialists extract data from a source system, apply whatever changes are needed to make it more user- and analytics-friendly, and then load that data into a repository intended for …

