What can we learn from high performers in other disciplines?

Photo by Evgeniya Litovchenko on Unsplash

You have a burning passion inside you.

Maybe you feel it in this moment, or maybe it’s a feeling you lost touch with lately due to circumstance.

The beauty of this passion — with it, you are capable of great things.

“The emotions you are feeling at this very moment are a gift, a guideline, a support system, a call to action. If you suppress your emotions and try to drive them out of your life… you’re squandering one of life’s most precious resources” — Tony Robbins, Awaken the Giant Within

Keeping the flame alive

One of the ways to keep the flame of your passion alive is to envision a…

If you’re as sick of this three-letter phrase as I am, you’ll be happy to know there is another way.

Take a Look Around You…

If you work in data in 2021, the acronym ETL is everywhere.

Ask certain people what they do, and their whole response will be “ETL.” On LinkedIn, there are thousands of people with the title ETL Developer. It can be a noun, verb, adjective, and even a preposition. (Yes, a mouse can ETL a house.)

Standing for “Extract, Transform, and Load,” ETL refers to the general process of taking batches of data out of one database or application and loading it into another.
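To make the three steps concrete, here is a minimal sketch of one batch run using Python's built-in sqlite3 module. The table names (users, users_clean), the columns, and the cleaning rule are all assumptions made up for illustration; a real pipeline would point at actual source and target systems.

```python
import sqlite3


def extract(source):
    """Extract: pull one batch of rows out of the source database."""
    return source.execute("SELECT id, email FROM users").fetchall()


def transform(rows):
    """Transform: trim whitespace and lowercase each email."""
    return [(user_id, email.strip().lower()) for user_id, email in rows]


def load(target, rows):
    """Load: write the cleaned batch into the target database."""
    target.executemany(
        "INSERT INTO users_clean (id, email) VALUES (?, ?)", rows
    )
    target.commit()


def run_etl(source, target):
    """Run the three steps in order and return the number of rows moved."""
    cleaned = transform(extract(source))
    load(target, cleaned)
    return len(cleaned)


def demo():
    # Hypothetical source system with slightly messy data.
    source = sqlite3.connect(":memory:")
    source.execute("CREATE TABLE users (id INTEGER, email TEXT)")
    source.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [(1, " Ada@Example.com "), (2, "BOB@example.com")],
    )
    # Hypothetical target system that receives the cleaned batch.
    target = sqlite3.connect(":memory:")
    target.execute("CREATE TABLE users_clean (id INTEGER, email TEXT)")
    run_etl(source, target)
    return target.execute(
        "SELECT id, email FROM users_clean ORDER BY id"
    ).fetchall()
```

Calling demo() copies both rows into the target with their emails normalized. Keeping extract, transform, and load as separate functions is one common way to make each step testable on its own.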

Data teams are the masters of ETL as they often have to stick their grubby fingers into…

Can you consider yourself a great developer if you aren’t producing quality code?

Photo by Markus Spiske on Unsplash

You want to work with great code

Code that makes you grow. Code that motivates you to write great code. Code that demonstrates mastery of fundamental concepts. Code that reflects thoughtfulness and care by its creator. Code that inspires.

Code that blows the hair you have left, straight back.

But I don’t have time to write beautiful code…

Look, I get it. Projects have deadlines. Jira tickets have point estimates. Getting something working in the quickest way possible is often a smart approach in a practical, corporate setting.

But if you find yourself always in that mindset, it’s a sign that you need to invest in yourself and grow. The lack of time you perceive for…

Learning why the Webster’s Director of Analytics doesn’t immediately fix problems, but takes the time to understand them first.

They say not to wait for a promotion.

Instead, start assuming the responsibilities of the job you want.

Someone who embodied this advice is Luigi de Guzman, a common Data Analyst when I joined the data team at Vroom in 2017.

Uncommon was the fact that if you spoke to anyone who also worked there, they’d tell you he was one of the most integral and respected employees in the whole company.

How did he do this?

If I had to boil it down to one sentence, I’d say it was a relentless attention to the details of how business processes worked.

To learn more about his mindset…

Efficient debugging in AWS is something you must constantly strive for.

Setting the Scene

Say you are an awesome developer sitting contentedly at your desk when a Slack message suddenly interrupts your peaceful mental flow:

It would appear there is a data issue with the new Activity History service released last month… Or at least a couple people think there is.

Instead of making progress on new tasks, you now need to drop them and look into what’s happening here.


Setting up the Problem

What this Activity History service does is calculate and then expose counts of how many times users have used the company’s application.

If we’re Netflix, it’s how many episodes a user’s watched…

If you agree that “the faintest ink is more powerful than the strongest memory” then you should know how to cheaply spin up your own homegrown transcription service.

Photo by Jean-Louis Paulin on Unsplash

If you’re like me, you might think it would be cool to generate a transcription of an online meeting now and then. Maybe you are conducting an interview or simply had a great conversation with a co-worker you’d like to reference again.

Whatever the case, there’s a zero percent chance that I’m paying for a premium Zoom account or coughing up multiple dollars per minute to transcribe a meeting. Not when I’ve spent the last half-decade of my life learning how to use the AWS platform!

Please tell me there’s a better way?

There is.

The better way is…

AWS released a new feature last week to export a full Dynamo table with a few clicks, but it’s also worth knowing how to hydrate a table with data at any scale.

Photo by Jeremy Bishop on Unsplash

No longer will anyone suffer while setting up the process of doing a full export of a DynamoDB table to S3.

The same cannot be said, however, for someone looking to import data into a Dynamo table. Particularly a large amount of data and fast.

The need for quick bulk imports can occur when records in a table get corrupted, and the easiest way to fix them is to do a full table drop-and-recreate. Or, when streaming data into a table, it can be useful to run a nightly batch “true-up” job to correct any intra-day anomalies that may have occurred.
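One building block of any bulk import is splitting the records into batches, since DynamoDB's BatchWriteItem operation accepts at most 25 put or delete requests per call. The sketch below only prepares the request payloads and is not necessarily the approach the full article takes. The helper names and the "activity" table name are hypothetical, and items are assumed to already be in DynamoDB's attribute-value format.

```python
from itertools import islice

# BatchWriteItem accepts at most 25 put/delete requests per call.
MAX_BATCH_SIZE = 25


def chunked(items, size=MAX_BATCH_SIZE):
    """Yield successive lists of at most `size` items."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def build_batch_requests(table_name, items):
    """Yield one RequestItems payload per batch of up to 25 items.

    Assumes each item is already in attribute-value format,
    for example {"pk": {"S": "user-1"}}.
    """
    for batch in chunked(items):
        yield {table_name: [{"PutRequest": {"Item": item}} for item in batch]}


def demo_batch_sizes():
    # 60 hypothetical records should split into batches of 25, 25, and 10.
    items = [{"pk": {"S": str(i)}} for i in range(60)]
    return [len(p["activity"]) for p in build_batch_requests("activity", items)]
```

Each payload could then be passed as the RequestItems argument to a boto3 DynamoDB client's batch_write_item call, with any UnprocessedItems in the response retried. Running many such calls in parallel is what makes an import fast, and that part is left out here.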

Knowing when and how to deploy this simple and versatile tool can be an extremely powerful option in data system design.

The following scenario truthfully occurred in my career, and was the moment the queue’s purpose in a system “clicked” and became clear in my mind. I hope that sharing this problem, the resulting struggle, and how I eventually solved it has a similar effect on you.

Friday Afternoon

Photo by Tomasz Rynkiewicz on Unsplash

It’s a Friday afternoon at the office (remember those?) and you are getting ready for the weekend. You have already closed your laptop and are mingling with your co-workers to see where the happy hour spot is this week.

Suddenly, Brett from the marketing department approaches you — a respected senior engineer at the…

How we fix errors faster than two shakes of a lamb’s tail.

On the analytics team at Equinox Media, we invoke thousands of Lambda functions daily to perform a variety of data processing tasks. Examples range from the mundane shuffling of files around on S3, to the more stimulating generation of real-time fitness content recommendations on the Equinox+ app.

Because of our reliance on Lambda, it’s critical to diagnose issues as quickly as possible.

Here’s a diagram of the process we’ve set up to do so:

Serverless error handling architecture

If you are also a user of Lambda, what does your error alerting look like? If you find yourself struggling to figure out why a failure…

We’re talking Snowflake vs AWS, Data Lakehouses, and Data’s Disempowerment of Decision Makers.

Idea #1: Snowflake and Databricks’ Number 1 Competitor is… AWS

Peter Bailis (left, CEO of Sisu) and Ben Horowitz (right, Partner at Andreessen Horowitz)

The dynamic between AWS and companies like Databricks and Snowflake is one I’ve wondered about before. It is hard not to, since AWS has its own products that compete directly with both Databricks (EMR) and Snowflake (Redshift). At the same time, Databricks and Snowflake are both built largely atop the AWS cloud platform. Surely there must be tension there, made even more palpable by the recent stunning Snowflake IPO.

It was nice to hear Ben Horowitz, one of the most successful VC investors, give his thoughts here. Essentially, data warehouses and distributed compute environments are extremely difficult…

Paul Singman

ML Engineering Lead at Equinox. Whisperer of data and productivity wisdom. Standing on the shoulders of giants.

