In March of 2021, I chose to leave the data team at Equinox Media and join lakeFS, a nascent open-source project, as its first developer advocate. In this post, I share a few reasons why I’m excited about starting this new chapter and the goals I hope to accomplish.
Your first instinct when altering a piece of code is to simply open it in an editor and make the desired changes. But somewhere along the way, somebody convinced you it’s worthwhile to wait a second and do one thing first…
> git checkout -b my-amazing-branch
It’s a small…
Data lakes offer tantalizing performance upside, which is a major reason for their high rate of adoption. Sometimes though, the promise of technological performance can overshadow an unpleasant developer experience.
This is troublesome since I believe the developer experience is as important, if not more so, in proving the worth of a technology or paradigm.
When creating and maintaining a complex system like a data lake, unfriendly user workflows and interfaces can sap productivity, similar to an application with too much tech debt or poor documentation.
One symptom of unfriendly workflows with a data lake is spending too much time in the…
Maybe you feel it in this moment, or maybe it’s a feeling you’ve lost touch with lately due to circumstance.
The beauty of this passion is that with it, you are capable of great things.
“The emotions you are feeling at this very moment are a gift, a guideline, a support system, a call to action. If you suppress your emotions and try to drive them out of your life… you’re squandering one of life’s most precious resources” — Tony Robbins, Awaken the Giant Within
One of the ways to keep the flame of your passion alive is to envision a…
If you work in data in 2021, the acronym ETL is everywhere.
Ask certain people what they do, and their whole response will be “ETL.” On LinkedIn, there are thousands of people with the title ETL Developer. It can be a noun, verb, adjective, and even a preposition. (Yes, a mouse can ETL a house.)
Standing for “Extract, Transform, and Load,” ETL refers to the general process of taking batches of data out of one database or application and loading it into another.
Data teams are the masters of ETL as they often have to stick their grubby fingers into…
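The “Extract, Transform, and Load” steps described above can be sketched in a few lines of Python. This is a minimal illustration, not a production pipeline — the `orders` and `orders_clean` tables and their columns are hypothetical, and in-memory SQLite stands in for the source and destination databases:

```python
import sqlite3

def etl(src, dst):
    # Extract: pull a batch of raw rows out of the source database
    rows = src.execute("SELECT id, name, amount FROM orders").fetchall()
    # Transform: normalize names and convert cents to dollars
    transformed = [(i, n.strip().lower(), cents / 100) for i, n, cents in rows]
    # Load: write the transformed batch into the destination database
    dst.executemany("INSERT INTO orders_clean VALUES (?, ?, ?)", transformed)
    dst.commit()

# Hypothetical source and destination, using in-memory SQLite
src = sqlite3.connect(":memory:")
src.execute("CREATE TABLE orders (id INTEGER, name TEXT, amount INTEGER)")
src.execute("INSERT INTO orders VALUES (1, '  Alice ', 1999)")

dst = sqlite3.connect(":memory:")
dst.execute("CREATE TABLE orders_clean (id INTEGER, name TEXT, amount REAL)")

etl(src, dst)
print(dst.execute("SELECT * FROM orders_clean").fetchall())  # [(1, 'alice', 19.99)]
```

Real ETL jobs differ mostly in scale and plumbing — the source might be an application API and the destination a data warehouse — but the three-step shape is the same.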
Code that makes you grow. Code that motivates you to write great code. Code that demonstrates mastery of fundamental concepts. Code that reflects thoughtfulness and care by its creator. Code that inspires.
Code that blows the hair you have left, straight back.
Look, I get it. Projects have deadlines. Jira Tickets have point estimations. Getting something working in the quickest way possible is often a smart approach in a practical, corporate setting.
But if you find yourself always in that mindset, it’s a sign that you need to invest in yourself and grow. The lack of time you perceive for…
Instead, start assuming the responsibilities of the job you want.
Someone who embodied this advice is Luigi de Guzman, a common Data Analyst when I joined the data team at Vroom in 2017.
Uncommon was the fact that if you spoke to anyone who also worked there, they’d tell you he was one of the most integral and respected employees in the whole company.
How did he do this?
If I had to boil it down to one sentence, I’d say it was relentless attention to the details of how business processes worked.
To learn more about his mindset…
Say you are an awesome developer sitting contentedly at your desk when a Slack message suddenly interrupts your peaceful mental flow:
It would appear there is a data issue with the new Activity History service released last month… Or at least a couple people think there is.
Now, instead of making progress on new tasks, you need to drop them and look into what’s happening here.
What this Activity History service does is calculate and then expose counts of how many times users have used the company’s application.
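At its core, a usage-count service like this reduces to counting events per user. A minimal sketch of that core, assuming a hypothetical event stream of dicts with a `user_id` field:

```python
from collections import Counter

def activity_counts(events):
    """Count how many times each user appears in a stream of usage events."""
    return Counter(e["user_id"] for e in events)

# Hypothetical events emitted by the application
events = [
    {"user_id": "u1", "action": "open_app"},
    {"user_id": "u2", "action": "open_app"},
    {"user_id": "u1", "action": "open_app"},
]
print(activity_counts(events))  # Counter({'u1': 2, 'u2': 1})
```

The real service wraps logic like this in storage and an API — which is exactly where discrepancies like the one in the Slack message can creep in.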
If you’re like me, you might think it would be cool to generate a transcription of an online meeting now and then. Maybe you are conducting an interview or simply had a great conversation with a co-worker you’d like to reference again.
Whatever the case, there’s a zero percent chance that I’m paying for a premium Zoom account or coughing up multiple dollars per minute to transcribe a meeting. Not when I’ve spent the last half-decade of my life learning how to use the AWS platform!
Please tell me there’s a better way?
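There is: Amazon Transcribe can transcribe an audio file sitting in S3 asynchronously. A minimal sketch with boto3 — the bucket, file name, and job-naming scheme here are hypothetical placeholders:

```python
import uuid

def build_job_request(media_uri, media_format="mp4"):
    """Build the request payload for Transcribe's StartTranscriptionJob API."""
    return {
        "TranscriptionJobName": f"meeting-{uuid.uuid4()}",  # must be unique per job
        "Media": {"MediaFileUri": media_uri},
        "MediaFormat": media_format,
        "LanguageCode": "en-US",
    }

def start_transcription(media_uri, region="us-east-1"):
    """Kick off an asynchronous transcription of an audio file in S3.

    Requires boto3 and AWS credentials configured locally.
    """
    import boto3

    client = boto3.client("transcribe", region_name=region)
    client.start_transcription_job(**build_job_request(media_uri))

# Example (not run here; needs a real bucket and recording):
# start_transcription("s3://my-bucket/recordings/meeting.mp4")
```

The job runs in the background; you poll for completion and then fetch the transcript JSON from the URI Transcribe hands back.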
No longer will anyone suffer while setting up the process of doing a full export of a DynamoDB table to S3.
The same cannot be said, however, for someone looking to import data into a Dynamo table — particularly a large amount of data, fast.
The need for quick bulk imports can occur when records in a table get corrupted, and the easiest way to fix them is to do a full table drop-and-recreate. Or when streaming data into a table, it can be useful to run a nightly batch “true-up” job to correct any intra-day anomalies that may have occurred.
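For a batch true-up like this, boto3’s `batch_writer` is the simplest starting point: it buffers puts into `BatchWriteItem` calls (which cap at 25 items each) and retries unprocessed items for you. A minimal sketch — the table name and item shape are hypothetical:

```python
def chunk(items, size=25):
    """Split a list into batches; DynamoDB's BatchWriteItem caps at 25 items."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

def bulk_import(table_name, items):
    """Write a batch of items into a DynamoDB table.

    Requires boto3 and AWS credentials configured locally.
    """
    import boto3

    table = boto3.resource("dynamodb").Table(table_name)
    # batch_writer handles the 25-item batching and retries behind the scenes
    with table.batch_writer() as writer:
        for item in items:
            writer.put_item(Item=item)

# Example (not run here; needs a real table):
# bulk_import("users", [{"user_id": "u1", "score": 10}])
```

This is fine for thousands of items; for truly large imports the write throughput of the table, not the client, becomes the bottleneck.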
The following scenario actually occurred in my career, and it was the moment the queue’s purpose in a system “clicked” and became clear in my mind. I hope that sharing this problem, the resulting struggle, and how I eventually solved it has a similar effect on you.
It’s a Friday afternoon at the office (remember those?) and you are getting ready for the weekend. You have already closed your laptop and are mingling with your co-workers to see where the happy hour spot is this week.
Suddenly, Brett from the marketing department approaches you — a respected senior engineer at the…
DevRel @lakeFS. Ex-ML Engineering Lead @Equinox. Whisperer of data and productivity wisdom. Standing on the shoulders of giants.