HumanTalks Paris November 2024

12 Nov 2024

On November 12th, the second Tuesday of the month, the HumanTalks Paris Meetup took place. This time, I was fortunate to be the host (because the event was held at Algolia), a speaker, and an organizer all at once, as I had previously been part of the organizing team when I lived in Paris.

This article serves as a recap of the talks from the session, offering insights for those who couldn’t attend.

Take Control of Your Time: Scripts to Streamline Routine

In the first talk, Paul Royer shared his experience with automating routine tasks in his workflow. He explained how he struggled with an inefficient deployment process at work. The existing method required several manual steps, such as making a pull request, waiting for it to be validated, and then manually updating a configuration file in a separate repository. This repetitive task took up valuable time for Paul and his team, prompting him to explore automation as a solution.

Paul initially considered using bash scripts for the automation but found the learning curve quite steep. Instead, he opted for JavaScript, a language he was already familiar with in his projects. By utilizing JavaScript, Paul was able to create a script to automate the task without introducing additional dependencies. This decision not only made it easier to implement but also simplified sharing the script with colleagues.

The next challenge Paul faced during the scripting process was accommodating different development environments. He realized that certain assumptions about the file structure that worked on his machine didn't hold true for his team members (maybe they had the secondary repository stored somewhere else, maybe they had a different copy of it, etc.). To address this, he programmed the script to ask the user for the necessary configuration the first time it ran, storing these settings for future use. This approach balanced automation with flexibility, ensuring the tool could adapt to individual setups without becoming overly complex. Aiming for simplicity is always the best option, IMO. And simplicity doesn’t mean everything has to feel magical: it’s ok to ask for user input if the input asked is clear.
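To make the "ask once, then remember" idea concrete, here is a minimal sketch. Paul's script was written in JavaScript; Python is used here purely for illustration, and the function name, the setting name and the file location are all hypothetical.

```python
import json
from pathlib import Path


def load_config(config_path, questions, ask=input):
    """Return saved settings, prompting the user only on the first run."""
    path = Path(config_path)
    if path.exists():
        # Later runs: reuse the stored answers without prompting.
        return json.loads(path.read_text(encoding="utf-8"))

    # First run: ask one clear question per setting, then persist the answers.
    config = {key: ask(f"{label}: ").strip() for key, label in questions.items()}
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(config, indent=2), encoding="utf-8")
    return config


# Hypothetical setting: where the secondary repository lives on this machine.
QUESTIONS = {
    "config_repo_path": "Path to your local copy of the configuration repository",
}
```

Calling `load_config(Path.home() / ".deploy-helper.json", QUESTIONS)` would prompt once and silently reuse the saved file afterwards. Passing the prompt function as the `ask` parameter keeps the sketch easy to test without a terminal.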

Through this experience, Paul highlighted the importance of optimizing repetitive tasks wherever possible; the time saved will have a compound effect over time. For developers, automating routine aspects of work can help streamline processes and free up more time for more valuable tasks.

Dependency Management

The next talk was on the subject of dependency management, presented by Sulidi Maimaitiming. He discussed how to handle and update dependencies effectively and how to evaluate the right timing for such updates. As someone interested in technical debt and code maintenance, I was especially interested in the challenges of keeping a tech stack up-to-date in a fast-evolving environment.

He mentioned that his team, working in a large bank, had a cutting-edge tech stack in 2020, which by now is starting to show signs of being outdated. It was a powerful reminder that what is considered "state of the art" today will eventually become obsolete.

One of the key points Sulidi brought up was the trade-off between immediate and long-term benefits. He illustrated that with the eternal negotiation between developers who aim to prevent codebase decay and product teams pushing for new features.

His analogy made me think of offering a child a candy now versus ten candies later—they will likely choose immediate gratification. We often fall into the same cognitive biases in decision-making, potentially putting off essential updates in favor of quick gains.

He then introduced the concept of "cost of delay," which involves shifting the burden of proof by scheduling updates and asking stakeholders why a feature needs to be prioritized first. This approach helps weigh the cost of moving a feature's release against the benefits of staying updated.

While Sulidi's talk did not offer a definitive solution to the challenges of dependency management, it provided me with additional considerations and ideas to ponder, particularly the concept of the "cost of delay".

First Rule of Code Club: It's Forbidden to Code Too Much

The third talk was about efficiency and efficacy. Ali El Amrani discussed the idea that constantly coding isn't always beneficial. His presentation was full of movie references, and maybe they distracted from the message, because I had a hard time following what he wanted to say.

Ali opened his talk with a great quote: "There’s no point in doing efficiently what shouldn’t be done at all." This set the tone, emphasizing that “doing the right thing” is more important than “doing the thing right”. He then argued that focusing solely on efficiency without considering efficacy will lead to wasted resources and dissatisfied clients.

I found it difficult to fully grasp Ali’s message, as his talk felt confusing at times, with examples that didn’t clearly align with the definitions he provided. What I did understand left me somewhat dissatisfied, as it seemed to carry an elitist undertone, suggesting a divide between "good" and "bad" developers—the good ones being those who dedicate themselves to “building a cathedral”.

I myself take pride (and joy) in practicing what I consider software craftsmanship, but I also believe it’s not the only path to doing meaningful and valuable work. Ultimately, I did not really connect with that talk, so I can’t say I’m doing Ali’s message justice here.

Successful Slides: Essentials to Know

Finally, I took the stage myself to talk about a topic close to my heart: how to deliver successful presentations. As someone who frequently conducts internal training at Algolia, I focus on helping colleagues express their knowledge confidently on stage. My goal is to provide practical tips that work for most people, especially those not accustomed to speaking in front of an audience.

One key point I emphasize is the importance of using slides. While skilled speakers might pull off talks without slides, I believe that for most of us, slides are essential in maintaining the audience’s attention. They serve as a secondary channel to convey our message, complementing what we say with visual aids. This dual-channel approach helps the audience grasp and retain the content more effectively. However, it's crucial to remember that slides should support, not overshadow, what you’re saying.

I brought attention to what I consider the Golden Rule for creating effective slides: one idea per slide. It’s really important to remember that the value of a slide is not in the amount of information it contains, but in its clarity to convey your message.

Having one idea per slide not only aids the audience’s understanding but also helps you as the presenter. It acts as a cheat sheet, guiding your speech and highlighting the main points you want to cover.

Conclusion

I had a few interesting discussions with attendees after the talk (discussing public speaking, the role of a Dev Advocate and Django/Rails differences).

Hosting this meetup at Algolia was once again a great experience: we love welcoming events like this to our space. If you’ve read this far, I hope to see you at a future event we organize, and if you’re a meetup organizer yourself, don’t hesitate to reach out! We have an amazing venue and would be happy to host your next gathering.

dotAI 2024 - Day One

07 Nov 2024

Last week I attended the dotAI conference. You'll find below a talk-by-talk recap of all talks of the first day. I'm no data scientist myself, so many details went way over my head, but I learned a lot, and had a very good time there.

Stanislas Polu, LLMs reasoning and agentic capabilities over time

The first talk was by Stanislas Polu, who explored the journey of AI's evolution. He mapped out the significant milestones and discussed their potential impact on the future.

Stanislas started with a graph categorizing AI tasks into three stages. Initially, AI focused on mimicking human perception—recognizing speech and identifying objects like cats or dogs. As it developed, AI began tackling more complex tasks, like writing text and extracting structured data from chaos.

His conclusion was intriguing: AI might soon be able to think. Currently, AI generates the next piece of information based on the previous one, lacking foresight. The future could bring an AI that reasons and provides solutions after some consideration.

He also shared insights from his time at OpenAI. Despite the models' efficiency in performing human tasks, initial adoption was low. In 2022, OpenAI shifted its focus to product accessibility by introducing a chat interface. This change aimed to reduce barriers and encourage broader use of their technology, paving the way for further innovation.

Stanislas touched on specialized models, like those that excel in games such as chess and Go. While models like GPT-3 and GPT-4 show improvements, they require significant GPU resources for minimal gains. We've reached a point where data saturation limits performance improvements, even with additional computational power.

Looking ahead, Stanislas sees the next step beyond large language models: machines that can think. He mentioned his company, Dust (https://dust.tt/), emphasizing that the current challenge lies in product development. For AI to progress, its adoption within companies needs to increase.

To support this, he suggested creating pipelines that connect various agents on platforms like Confluence, Slack, or Spendesk. His talk explored these concepts and opened up discussions about future possibilities.

Ines Montani, Reality is not an end-to-end prediction problem: Applied NLP in the age of Generative AI

The second talk of the day was presented by Ines Montani from spaCy, Explosion, and another company I can't recall. Her talk was vibrant and engaging, filled with memorable catchphrases and punchlines. She discussed the world of software 2.0.

In software 1.0, you write code, compile it, and get a functional program. To improve it, you refactor the code for better results. Modifying the compiler itself is generally not the approach to achieving more specific outputs.

In the software 2.0 realm, you feed data into an algorithm to get a model. Improving this model involves optimizing and refactoring the data. Just like with a compiler, focusing on model optimization hits a plateau, whereas most progress can be made by refining the data.

Her talk emphasized the ability to refactor data, highlighting the importance of quality over quantity.

Ines doesn't believe in a universal model that can do everything. She advocates for specific models tailored to individual tasks for optimal performance. With reinforced learning, applications can learn from examples to execute tasks correctly. The quality of their output is more dependent on input quality than on the model's quality.

She noted that this new paradigm is challenging for developers to embrace because it involves working with a "black box" (the model). However, she explained that machine learning can be modularized into small components that interlock.

This modularization has all the benefits known in software engineering: scalability of individual elements, respect for data, the ability to run components locally, or test them individually, etc.

Ines shared a metaphor that I found very illustrative. She explained that in early 20th century England, people often couldn't afford alarm clocks. So, they hired someone to knock on their windows at the desired time. If someone back then had had access to machines to replace human work, they might have invented a device that walked through the streets knocking on windows automatically, and that would have been their vision of "progress."

Today, with AI, we're in a similar situation. Instead of merely replacing human tasks with AI, true innovation lies in reinventing how we accomplish these tasks, akin to the invention of the alarm clock back then.

That's the essence of Ines's metaphor: beware of creating modern "window-knocking machines" when more elegant and autonomous solutions exist. She gave the example of shared calendars. If you're trying to find a common time slot with someone in another time zone, you might imagine an AI accessing both calendars and acting as a chatbot to find a mutual slot. It's a practical but limited solution.

True innovation, according to her, is the approach of tools like Calendly. By exposing available time slots, the tool allows everyone to directly choose a convenient time, simplifying the task without complex AI intervention, offering a more natural and seamless experience.

Her talk demonstrated that AI can help us enhance what we're already doing without replacing everything. It's important to continue splitting and modularizing our systems rather than merging everything into a large monolith.

Merve Noyan, Gain full control of your apps with Open-source AI

In the latter part of the morning, after the first break, we attended a talk by Merve Noyan from Hugging Face. This is when I started noticing a recurring pattern in the presentations: many speakers seemed to highlight their companies as much as, if not more than, the underlying topics. Some did it more subtly than others, but it often felt like a parade of product placements, with each company showcasing what they do.

It's a classic at most conferences—a necessary evil we've come to accept. That said, the talk from Hugging Face was far from the worst, and their open-source orientation conveyed an interesting message of cooperation. She managed to capture my attention and left me with a positive impression.

For an outsider like me, Hugging Face is starting to resemble the GitHub of Machine Learning and AI. There are numerous models available that you can try directly with the API, build with a Docker command, fine-tune, quantize, and do many other things with terms I'm not entirely familiar with yet.

The speaker illustrated Hugging Face's role in the AI ecosystem, highlighting its platform where developers can publish models publicly, enabling others to test, use, or improve them through fine-tuning and other techniques.

She emphasized the importance of collaboration to enhance the overall quality of models, showing that, much like in software engineering, the power of collaboration and open-source creates the state of the art.

In this spirit, Hugging Face provides numerous open-source tools, facilitating the improvement of existing models. For instance, she mentioned Whisper, whose first version was improved thanks to a community fork on Hugging Face, so much so that the second official version is actually based on this fork. This approach demonstrates the power of community in the AI field, much like what we've seen with open source in software development.

Her final message was that open source will always win. Her argument was that, historically, open-source models have caught up with proprietary ones and now offer similar capabilities. There's a plateau beyond which adding more GPUs no longer brings improvements. Even a big company with a lot of money won't get significantly better results from more GPU capacity for training. However, optimizing the model at the data input level or through parameter distillation makes it more efficient, lighter, and faster. This work can be done by the community, for which Hugging Face provides plenty of SDKs.

There was a very interesting aspect of what Hugging Face is doing at this level. I was puzzled at times by certain terms used, like "open source," which seems to have a different definition in the AI field than what I know in software engineering.

My personal reflection after working in a large company is that open source allows for incredible innovation. We can use libraries created by others to improve something without having to build everything ourselves. Large companies have tremendous inertia; they can put a lot of GPUs behind their projects, but the return on investment eventually stagnates.

The only thing of real value is collaborative improvement. With open-source models, this improvement must be tested by real users and their data to analyze performance and results. You can create the best model theoretically behind four walls, but it is imperative to have it tested by real people, and in that sense, a public development model like the one Hugging Face offers is an incredible resource.

Sri Satish Ambati, Democratizing AI with Open Source Multi-Agent Systems: Advancing the Future of Workflow Automation

Later in the morning, there was a talk by Sri Satish Ambati.

Honestly, I must admit I didn't retain much. The presentation was quite hard to follow: he spoke continuously, stringing words together without real pauses, and the slides were not at all in sync with what he was saying.

He jumped from one idea to another without apparent transitions, and he changed slides mid-sentence, scrolling through his list of slides without ever stopping, making comprehension even more difficult.

In the end, the moderator even had to interrupt him because he had run out of time, and he added that he could have said much more. Unfortunately, I have no clear memory of his message, and it left me with mixed feelings about H2O, still unsure of what they actually do.

Lightning Talks ⚡

Early in the afternoon, after a good lunch and a good coffee, we got back in the room for some Lightning Talks (which were quite long, about 10 minutes each).

The first lightning talk was presented by Daniel Phiri from Weaviate, a vector database. Daniel was one of the first to introduce the concept of multimodality, a theme that resurfaced frequently throughout the event. His metaphor particularly stuck with me: he compared our interaction with AI to human perception, which involves five senses. Currently, when we interact with AI, it’s often in a one-dimensional way, like through text or voice, but rarely by combining multiple modes of communication.

Daniel explained that to move towards a more natural and human interaction, AIs need to incorporate this multimodality, akin to preparing a complete dish rather than just offering the smell, taste, or touch separately. He argued that to truly advance, AI must integrate all these aspects coherently.

He also mentioned that a lot of effort is still focused on what he calls the "plumbing" of AI, which involves connecting different systems so that information flows correctly. For example, transforming generated text into a JSON structure, or ensuring that various types of data can coexist in a common vector space. We’re still in the early stages, but Daniel sees this multimodal integration as a crucial step for AI's future, where each "sense" generated must work together smoothly and naturally.
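As an illustration of this kind of "plumbing", here is a minimal sketch of pulling a JSON structure out of free-form generated text. It is not taken from Daniel's talk: the function name and the sample text are made up, and real pipelines often rely on a model's structured-output features instead.

```python
import json


def extract_json(generated_text):
    """Return the first JSON object found in free-form text, or None."""
    decoder = json.JSONDecoder()
    start = generated_text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value beginning at `start`
            # and ignores whatever text follows it.
            value, _end = decoder.raw_decode(generated_text, start)
            return value
        except json.JSONDecodeError:
            # Not valid JSON from this brace: try the next one.
            start = generated_text.find("{", start + 1)
    return None


# Made-up model output, used only to show the call.
reply = 'Sure! Here is the result: {"title": "Cats", "tags": ["pets", "cute"]} Hope that helps.'
parsed = extract_json(reply)
```

The point of the sketch is that the model's answer is treated as untrusted text: the surrounding chatter is skipped and anything that does not parse is rejected rather than guessed at.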

Next, Audrey Roy Greenfeld gave an excellent talk on how they've used machine learning to generate fake news, like on The Onion. It was a concrete example: they had fun playing with a model to generate sarcastic fake news. They realized it works. And it works so well that they spend their time reading the generated news to validate them.

Since they then became the bottleneck in determining which ones are funny and which aren't, they automated the process. How do you get the tool to self-analyze the quality of the jokes? The problem is that it now starts making jokes in its analyses. If it's the same tool fine-tuned to make jokes and to assess if the jokes are funny, it keeps making jokes in its analysis. So, they had to use two different elements.

To ensure diverse perspectives, they generated a galaxy of fictitious characters, who would be the fictional authors of their fictional articles. By varying the characters, even if they are randomly generated (they had 300 different ones), it allows for different jokes that don’t all sound the same. Even if they couldn't revisit exactly the characteristics of the character in question, it produces different outputs. Excellent talk.

Then, Maxim Zaks talked about a programming language called Mojo. He drew analogies with Modern Family and showed that Mojo is a new language that simplifies things. It’s not really an overlay on Python, but it’s interesting because Python is the default language for many things in Machine Learning, and Mojo uses similar paradigms to be easy to learn. Mojo draws from Python’s syntax to make simple things easy and complex things possible.

Afterward, Grigory Dudnik gave a talk that I found very difficult to follow on how to create practical applications and help developers write better code. He addressed the challenges to resolve for tools like GitHub Copilot or Cursor to work, but I struggled to follow and couldn’t summarize his points.

Steeve Morin, ZML: ML framework for Zig

After the lightning talks, there was a presentation by Steeve Morin, who introduced us to ZML. It's a build pipeline for creating models and running them on any platform, not just NVIDIA. I wasn't too aware of the rivalries between different players, but Steeve seemed to be on a crusade specifically against NVIDIA. ZML is a proposal to develop without being tied to a specific platform.

ZML utilizes four existing technologies within the ecosystem. There's a programming language named Zig, similar to Mojo, for writing models. Bazel is used as the build platform. There are two other components whose names I didn't catch, but they enable cross-compilation for different platforms.

Their goal is to simplify the developer experience with command-line tools. These tools allow for building across multiple platforms and producing a single binary that's compatible everywhere. No need for multiple binaries; one is enough.

In conversations with others, it seems widely acknowledged that NVIDIA holds a dominant position and that alternatives are needed. Steeve advocates for a community-driven and open-source approach by using existing tools and giving people the freedom to choose their own providers.

ZML offers cost flexibility by allowing users to switch providers based on needs and budget, facilitating the use of different providers and ensuring ease of transition.

So, ZML seems promising. The questions posed after his talk show a genuine interest in open source and the fight against NVIDIA's dominance. Steeve aims to ease the creation and execution of code on various platforms with an optimized developer experience, which I greatly appreciate.

Yann Léger, Serverless Inferencing: an infrastructure point of view

The last talk before the break was on inference, presented by Yann Léger. I initially struggled to understand this talk because I wasn't familiar with what inference meant. I finally grasped that inference is what happens at the end of the chain; once a model is trained, inference is when you query it.

What matters during inference is speed: the output should be generated at roughly reading speed, since a human reads it as it arrives.

Yann has built a system that allows inference at the edge, meaning as close as possible to the end-user. You deploy your model there, which enables scaling inference nodes in specific areas.

In this talk, I again felt that speakers often focus on presenting their specific solutions. They address a problem—here, inference—and how they solve it, but instead of providing practical advice applicable to everyone, they mainly describe how they handle it within their own company. This gives the impression of proprietary product placement rather than a genuine session of sharing best practices.

The talks that stood out most to me during the day were those that avoided the "product placement" format and had a long-term perspective. All those with a long-term and sustainable vision advocated for open-source. In contrast, those focused on short-term plumbing solutions were affiliated with a company.

This talk raised an important issue, but I would have appreciated more depth on the problem itself and long-term solutions to address it. The approach was too centered on "how we solved this problem, trust us, and use our service so you don't have to think about it." The truth is, I have no guarantee that this company will still be around in five years, whereas I know the problem will remain. To move forward, I would have preferred to see us find solutions and best practices together, rather than rely on a commercial solution that masks and abstracts the problem.

Pierre Stock, The future of Edge Agents

The next talk, from Mistral, was one of my favorites of the day. It focused on LLM at the edge, in embedded systems, when there’s no internet connection available and it needs to run on users' phones or computers.

What I found interesting was that it was a very specific topic, and the speaker, Pierre Stock, explained it very well.

Unlike other talks that didn’t always have clear topics, this one did. I appreciated Mistral's ability to explain complex elements simply, demonstrating a deep understanding. They discussed the number of parameters in a model, measured in billions. These parameters take up a lot of space and must fit into limited RAM.

To achieve this, they use a technique called quantization. This involves reducing the precision of each parameter by approximating it with a value between 0 and 255 (8 bits), which reduces size while maintaining some quality of response. But that’s not enough. A second phase of quantization further reduces size, and quality along with it, by assigning each parameter one of only 16 values (4 bits).

To minimize information loss by reducing scale, they train their model to be resistant to the noise created by this quantization.
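For readers who want to see what quantization does to numbers, here is a minimal sketch of round-to-nearest quantization on a handful of made-up weights. It is not Mistral's method: the function names and values are illustrative, and it does not show the noise-resistant training step described above.

```python
def quantize(weights, bits):
    """Map floats onto 2**bits evenly spaced integer levels."""
    levels = 2 ** bits - 1
    low, high = min(weights), max(weights)
    # Fall back to 1.0 so identical weights do not cause a division by zero.
    scale = (high - low) / levels or 1.0
    codes = [round((w - low) / scale) for w in weights]
    return codes, scale, low


def dequantize(codes, scale, low):
    """Recover approximate floats from the integer codes."""
    return [code * scale + low for code in codes]


def max_error(weights, bits):
    """Largest gap between an original weight and its quantized version."""
    codes, scale, low = quantize(weights, bits)
    restored = dequantize(codes, scale, low)
    return max(abs(w - r) for w, r in zip(weights, restored))


# Made-up example weights, not taken from any real model.
weights = [-0.82, -0.31, -0.05, 0.0, 0.12, 0.47, 0.93]

error_8bit = max_error(weights, 8)  # 256 levels per weight
error_4bit = max_error(weights, 4)  # 16 levels per weight
```

With fewer levels, each weight lands further from its original value, so `error_4bit` comes out larger than `error_8bit`. That rounding gap is the "noise" the model has to tolerate.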

I loved the explanation because it was clear, like sharing a tip without using complicated words, and without claiming it was a "secret sauce." They also talked about using a KV cache (key-value cache) to generate tokens more efficiently. Honestly, that part went over my head (especially the sliding windows part), but I still gathered that they knew what they were doing, and their approach was very transparent and smart.
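The general idea behind that cache can be sketched in a few lines. During generation, the keys and values computed for earlier tokens are stored so they are not recomputed for every new token, and a sliding window keeps only the most recent ones. The class below is a toy sketch under those assumptions, with strings standing in for tensors; it is not the implementation presented in the talk.

```python
from collections import deque


class SlidingWindowKVCache:
    """Keep the key/value pairs of only the most recent `window` tokens."""

    def __init__(self, window):
        # A deque with maxlen drops its oldest entry automatically when full.
        self.entries = deque(maxlen=window)

    def append(self, key, value):
        """Store the key and value computed for the newest token."""
        self.entries.append((key, value))

    def keys(self):
        return [key for key, _ in self.entries]

    def values(self):
        return [value for _, value in self.entries]


# Toy run: five tokens pass through a cache that remembers only three.
cache = SlidingWindowKVCache(window=3)
for position in range(5):
    cache.append(f"k{position}", f"v{position}")
```

After the loop, the cache holds entries for tokens 2, 3 and 4 only, so memory use stays bounded no matter how long the generated text gets.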

The talk was very interesting because the speaker explained complex concepts simply, showing his mastery of the subject. He gave concrete examples and answered questions honestly. I liked their fun and modest approach.

In the Q&A, he mentioned that while we’re making a lot of progress in AI, adding black boxes creates problems for the future. He prefers to resolve existing black boxes before tackling new problems. I really like this philosophy because it emphasizes understanding and transparency in systems.

As a developer, I agree with the idea that if I can’t explain something, it means I don’t understand it well enough to improve or debug it properly. This talk really resonated with me and aligned with my own concepts and experiences in development.

Dr Laure Seugé & Arthur Talpaert, More empathy and health data protection in AI: announcing a primary care revolution.

The next presentation was from Doctolib. While I thought the one from H2O was the worst of the day, Doctolib's presentation set a new low. I found it extremely awkward and out of place. It was truly uncomfortable, and I almost considered leaving the room.

Perhaps it's my background as a public speaking coach talking here, but I noticed many red flags—things I usually advise against in presentations.

Two Speakers: The presentation began with two speakers, a delicate exercise that only works if they offer different perspectives and are used to working together. Here, one was a doctor and the other a technician. This could have been interesting, but the lack of natural interaction felt like poorly executed acting. The presentation might have been more effective if delivered by just one of them, especially since the tech role wasn't particularly technical.

Not a TED Talk: They probably aimed too high with TED Talk-style phrases, starting with "Imagine a world where..." without even saying hello. Their intention to inspire fell flat and created an awkward atmosphere.

No Introduction: They failed to introduce their use case, assuming everyone knew Doctolib. While this might be true for a French audience, presenting to a European audience required more humility. Explaining their use case would have helped everyone understand the context.

Product Announcement: Their presentation felt more like a product announcement, focusing on a new feature without detailing the technology's workings. In a technical setting, more information on the underlying technologies and challenges was expected.

Video: They used a video in a relatively short talk. In 18 minutes, dedicating 5 to 6 minutes to a video seemed excessive. A shorter video would have sufficed, as the main idea was clear from the start. Additionally, showing the video in French to an English-speaking audience seemed like a lack of adaptation.

Poor Slides: They had few slides, which made the talk hard to follow and limited understanding. One slide on data transmission was too confusing and needed to be broken down into parts. Other slides were disconnected from their speech, diluting their message.

I must admit my critique might be a bit emotional, as I've worked at Doctolib and helped colleagues prepare for public speaking. My focus is not to criticize them harshly but to see them do better. They had content that could have captivated the audience, but it was undermined by the presentation format. It was a missed opportunity for an engaging presentation.

Despite the presentation's weaknesses, what stood out when I discussed it with others during the break was the speakers' genuine enthusiasm. They were visibly proud and happy to present their work. Even though the presentation had its flaws, you couldn't help but feel happy for them, seeing them so passionate about their success.

Romain Huet & Katia Gil Guzman, Building with OpenAI: What's Ahead

At the end of the first day, there was a talk from OpenAI. I expected it to be presented by Romain Huet, whom I've met in the French startup ecosystem. He was previously with Stripe, and I was looking forward to seeing him again. I remember being surprised when I watched the OpenAI keynote and saw Romain take the stage. I had no idea he worked there. It felt like two worlds collided for a moment.

Ultimately, they made the call from San Francisco. Romain did an introduction, and Katia, an OpenAI developer advocate, conducted the demo on stage. For a while, I wondered if Romain was AI-generated or if the call was pre-recorded. There were few interactions between them, which is understandable given the challenge of making an engaging talk with someone on the other side of the planet.

Romain made remarks about some things not going as planned in the live demos. We all know live demos don't always go as smoothly as rehearsals. Despite this, it was a fascinating talk. As a developer advocate, I must commend Katia's work. She was incredible on stage, perfectly at ease and in complete command of her subject. She answered complex questions and performed a live demo without stress, even when not everything worked perfectly.

Katia showcased OpenAI's advanced voice capabilities, explaining how you can have a live conversation with the agent. Currently, when using ChatGPT, you talk to your phone or another system. The Whisper model transforms your speech into text, which is then sent to GPT for a solution. GPT responds in text, and this text is converted to audio so you can listen to the answer.

It's somewhat like a walkie-talkie conversation where each party must wait for the other to finish speaking before responding. This slows the process because multiple transformations are required, and information can be lost along the way.

They also discussed multimodality, allowing communication with images and sound simultaneously. The demo was impressive: they showed how to navigate a solar system by voice using a 3D rendering application in the browser.

They indicated that this feature is available in the United States and will soon be in Europe. It was an excellent demo to conclude the conference.

Katia faced interesting questions, including about generating different voices and potential identity theft risks. Her response was clear: just like with email, there will always be people who misuse new technologies for malicious purposes, but that shouldn't stop us from using them.

It wasn't my favorite talk because the information presented was already known to me from other OpenAI conferences and product announcements. However, it was very impressive and well executed. It was a good way to end the day with something polished and impressive.

There was nothing to fault in this talk: it was exactly what I expected, no more, no less, with excellent quality in both execution and presentation.

Conclusion

Attending an AI conference often leaves one unsure of where the content will fall: somewhere between surface-level buzzword bingo and highly technical discussions that require at least five PhDs to understand.

Here, the talks struck a balance, aimed at a tech-savvy audience building AI tools and tackling shared challenges. Some sessions dove into unique use cases and specific issues, which I found fascinating and full of insights. Writing this summary took time, but it's only the start. With a second day of talks ahead, I’ll be sharing more soon, and I encourage everyone to check out the upcoming content.

We Love Speed 2024

17 Sep 2024

Yesterday, I attended the WeLoveSpeed conference here in Nantes, a conference I’ve been attending for years. It’s one of my favorite conferences because it dives into something I’ve always cared about: web performance. There were two tracks with talks in both French and English, and I tried to attend as many as I could.

Even though I’m not as hands-on with web performance as I was earlier in my career, it still interests me—everything from optimizing network connections to tweaking JavaScript, CSS, and images. Over time, the field has evolved, especially since Google rolled out Core Web Vitals and made them a ranking factor. That pushed web performance into the mainstream, beyond just a tech niche as it was initially.

What I find wonderful about performance is that it’s one area you can always improve, and no one will ever say "This is too fast!". In other areas, such as pushing a new feature or a redesign, you'll always have people complaining (subjectively) that they don't like it. Speed seems to be one of the only elements that everybody agrees on.

In the rest of this post, I’ll share recaps of the talks I saw. As a public speaking coach myself, I value effective storytelling and visuals that drive home the message. If I can explain the content of a talk to someone else after attending it, I know it’s stuck with me.

Some talks didn’t teach me as much, either because I was already familiar with the content or because the tips weren’t relevant to my work. None of the talks were bad, but some definitely stood out more than others.

AB Testing

The first talk of the day was by the team at Fasterize, and they tackled a big question: Does improving web performance really lead to more revenue? We've all heard the claim that speeding up your site by 100 milliseconds boosts conversion rates by 1%. For a decade, people have repeatedly cited this statistic, but is it really accurate?

Fasterize investigated this with real-world testing on their own customers. They provide their clients with an A-B test feature: they compare two versions of a site—one optimized for speed and one that isn't. It's important that the tests run at the same time to avoid things like seasonal sales skewing the results. This way, they can measure the actual impact on conversion rates, average order value, revenue, or whatever else is important to you.
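To make the "two versions at the same time" idea concrete, here is a minimal sketch of how visitors could be split into two groups. Hashing a stable visitor ID is a common bucketing technique, but it is my assumption here and not a description of how Fasterize does it; `assignGroup`, the group names and the 50/50 split are all hypothetical.

```typescript
type AbGroup = "optimized" | "control";

// FNV-1a hash: small and deterministic, good enough for bucketing in a sketch.
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // keep it an unsigned 32-bit value
  }
  return hash;
}

// The same visitor ID always lands in the same group, so both versions
// of the site receive traffic during the same period.
function assignGroup(visitorId: string, optimizedShare: number = 0.5): AbGroup {
  const bucket = fnv1a(visitorId) / 0x100000000; // value in [0, 1)
  return bucket < optimizedShare ? "optimized" : "control";
}
```

Because the assignment depends only on the ID, a returning visitor keeps seeing the same version for the whole duration of the test.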

An important part of this is understanding correlation vs. causation. It’s like saying ice cream sales and shark attacks both rise in the summer—it doesn’t mean eating ice cream causes shark bites, it just means more people go to the beach in summer.

They also mentioned that it takes a while to get clear results. For instance, if a user begins their shopping on Monday, you initiate the A-B test on Tuesday, and the customer doesn't make any purchases until Thursday, the data becomes muddled between the old and new versions of the site. You need to be patient and wait for those in-progress purchase funnels to end, before you can see any actual results.

Another issue is that people frequently switch devices for the same cart. People may begin their shopping journey on their phone in the morning and conclude it on their desktop in the evening. Because they are technically two different devices, there is a risk they won't be attributed to the same A-B test group, skewing the results. Those cases need to be filtered out of the data.
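The two filtering rules above (funnels that started before the test, and users seen in both groups) can be sketched as a simple clean-up pass over the purchase data. The `Purchase` shape and the `userId` field are hypothetical; a real setup would need some way of recognizing the same person across devices, such as a logged-in account.

```typescript
interface Purchase {
  userId: string;                  // hypothetical cross-device identifier
  group: "optimized" | "control";  // A-B group the purchase was recorded in
  funnelStartedAt: number;         // timestamp (ms) of the first funnel step
  amount: number;
}

function keepCleanPurchases(purchases: Purchase[], testStartedAt: number): Purchase[] {
  // Record every group each user has been seen in.
  const groupsByUser = new Map<string, string[]>();
  purchases.forEach((p) => {
    const seen = groupsByUser.get(p.userId) || [];
    if (seen.indexOf(p.group) === -1) seen.push(p.group);
    groupsByUser.set(p.userId, seen);
  });

  return purchases.filter((p) => {
    const startedDuringTest = p.funnelStartedAt >= testStartedAt;
    const singleGroup = (groupsByUser.get(p.userId) || []).length === 1;
    return startedDuringTest && singleGroup;
  });
}
```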

Some trends take longer to appear than others. For example, Core Web Vitals can be gathered in a matter of weeks, but behavioral impacts (e.g., people adding an item to their cart) and business impacts (e.g., increase in revenue) will require months. It is also pretty useless to compare data between two given days; what is of interest is the overall statistical trend.

They stressed that live testing with real users is the only way to represent an accurate outcome. We can use synthetic tests as a guide, but they cannot accurately represent complex human behaviors. There is no way to predict what real users will do; we can only guess. Real user data gives us a true picture of how performance changes affect revenue.

When asked how to prove performance improvements have an impact without their A-B tool, Fasterize admitted that it’s tricky. But after 10 years of working with customers, they are convinced that improving performance has a positive impact (or, worst-case scenario, a neutral one) on revenue.

One key takeaway was that no single Core Web Vital will guarantee a revenue boost; it’s a mix of factors that work together. CLS is not more important than FID or LCP, per se.

It was a good talk, and it crystallized some thoughts I had on A-B tests with clear and concrete explanations.

Leboncoin

In the next talk, the Leboncoin technical team shared how they handle web performance in their company. What really stood out was that they're not only tech experts, but they also know the business inside out, with 18 years of cumulative experience. This gives them a unique perspective on what metrics really matter for their specific needs.

One big focus was on Core Web Vitals (CWV), which are key indicators of web performance, but they made it clear that not all of them are equally important depending on the context. They explained why certain metrics, like Cumulative Layout Shift (CLS), are a bigger deal for Leboncoin than others because of how their website works.

Ads, which appear within search results or in banner-like frames around the main content, are a major source of revenue for Leboncoin, but they can sometimes cause layout shifts that frustrate users. For example, a user might try to click on a category, but an ad suddenly appears, causing them to accidentally click on it instead. That's why they place a high priority on preventing layout shifts.

Maybe for other businesses and other contexts, LCP (Largest Contentful Paint) would be a more important metric to tackle. It all depends on context. The main point is that, even if all CWV are important in theory, you can probably still vaguely rate them in order of importance based on your specific needs.

Back to the layout shift. They’ve tackled this by pre-assigning heights to ad spaces, so even if the content changes, the layout doesn’t shift. This reduces the chance of accidental clicks. But fixing this led to another problem—ad partners saw a drop in conversion rates, thinking there was something wrong with the site. In reality, it was just fewer users clicking ads by accident! This example shows how tricky it can be to balance user experience with business needs and why it’s essential to educate teams internally.
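As a rough illustration of pre-assigning heights, the sketch below reserves room for the tallest ad a slot might receive. The `AdSize` type, the function name and the choice of `min-height` are my assumptions; the talk did not detail how Leboncoin computes the reserved space.

```typescript
interface AdSize {
  width: number;
  height: number;
}

// Reserve room for the tallest creative the slot may receive, so that
// whichever ad eventually loads, the content below it does not move.
function reservedSlotStyle(possibleSizes: AdSize[]): string {
  const maxHeight = possibleSizes.reduce((max, size) => Math.max(max, size.height), 0);
  return "min-height: " + maxHeight + "px;";
}
```

The resulting declaration would be applied to the ad container before the ad request is sent, so the space exists even while the slot is still empty.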

Their approach goes beyond just tech fixes. They emphasize making these metrics simple to understand for everyone in the company. They’ve created tools and dashboards, visible to all developers, to show the impact of different metrics. They also train developers to interpret these graphs properly. For instance, a spike in 404 errors does not necessarily indicate a bug: if 200 responses increase by a similar amount at the same time, it simply reflects a general increase in traffic. Learning how to read data in context is key.
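One way to encode that "read it in context" rule is to look at the share of 404s rather than their raw count. This is only a sketch: the `StatusCounts` shape and the one-percentage-point tolerance are hypothetical values I picked for the example.

```typescript
interface StatusCounts {
  ok200: number;
  notFound404: number;
}

function notFoundRate(counts: StatusCounts): number {
  const total = counts.ok200 + counts.notFound404;
  return total === 0 ? 0 : counts.notFound404 / total;
}

// A spike is only suspicious if the share of 404s grows, not just their count.
function isSuspicious404Spike(
  before: StatusCounts,
  after: StatusCounts,
  tolerance: number = 0.01
): boolean {
  return notFoundRate(after) - notFoundRate(before) > tolerance;
}
```

With this check, traffic tripling across the board (three times more 200s and three times more 404s) is not flagged, while the same number of extra 404s on flat traffic is.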

They also shared how they monitor performance using Lighthouse CI, which integrates directly into GitHub. If a Pull Request significantly degrades the CWVs, it cannot be merged until the drop is fixed. While this might seem like overkill for a small drop, it’s much better than discovering an issue after it hits production and causes major problems.
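The gating logic itself is simple to picture. The sketch below is not Lighthouse CI's API, just a hand-rolled comparison between a baseline and a Pull Request; the `VitalsReport` shape and the 10% tolerance are assumptions made for the example.

```typescript
interface VitalsReport {
  lcpMs: number; // Largest Contentful Paint, in milliseconds
  cls: number;   // Cumulative Layout Shift score
}

// Returns the names of the metrics that got worse by more than the tolerance.
// An empty list means the Pull Request can be merged.
function findRegressions(
  baseline: VitalsReport,
  pullRequest: VitalsReport,
  tolerance: number = 0.1
): string[] {
  const failures: string[] = [];
  if (pullRequest.lcpMs > baseline.lcpMs * (1 + tolerance)) failures.push("LCP");
  if (pullRequest.cls > baseline.cls * (1 + tolerance)) failures.push("CLS");
  return failures;
}
```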

The talk also stressed the importance of shared knowledge and understanding across teams. Educating teams, whether it's developers, marketing, or SEO, is critical. Teams working on ads might have different goals from those optimizing for SEO, but it’s important that they all understand why certain metrics matter and how they affect the business. In that regard, they integrated graphs of CWV directly into the tool the SEO team was already using (Search Console), so the team wouldn't have to open a separate tool (and likely forget about it).

At the end of the day, the key takeaway is that while the specifics of which CWV matter most will differ from company to company, the methodology remains the same: figure out what’s important to your business, make the data accessible and understandable, and keep everyone aligned. Tech is only part of the solution—knowing your company and ensuring all teams are working toward the same goals is the hardest part.

Tight mode and 2-steps waterfall

Another amazing talk was also probably the funniest of the day, with the speaker progressively dressing up as a clown on stage as the behavior of the browser became more and more erratic. He dove into how browsers load assets like CSS, JavaScript, and images, and how they decide what’s most important to load first.

He introduced the concept of "tight mode." This is when browsers load assets in two phases, tight mode being the first of the two phases. The lack of a standard makes it poorly documented and handled differently by each browser. The reason tight mode exists is that many web servers (Nginx, but especially Node.js) don’t handle multiplexing correctly in HTTP/2. This means they don’t always send assets back in the right order of priority, but instead they are all mixed up. So, to mitigate it, browsers had to come up with their own tricks.

He explained how the various browsers handle this, which serves as a guide to this undocumented behavior. For example, Firefox sticks to the official specification, assuming the server does everything right (which it rarely does). Meanwhile, Chrome and Safari have their own ways of empirically guessing what’s important and loading those assets first.

The main principle of "tight mode" is that everything important in the head should be fetched before we attempt to fetch anything from the body. We first download JavaScript from the head (I think with a limit of 2 requests in parallel), and only then proceed to download the body. Chrome takes a more aggressive approach, attempting to load the first five images from the body as well while simultaneously downloading elements from the head. This approach is based on the possibility that one of the primary images might be causing a layout shift and might be among the first five images, so it's better to preemptively download it. Safari, on the other hand, doesn't implement this but has a different behavior when it comes to scripts in the head or marked as async (or something similar; I don't exactly remember the specifics).

The big takeaway? The same webpage will load differently depending on which browser is viewing it. The waterfall you get when loading a site varies so much between browsers that trying to optimize it for one could end up hurting performance in another. The speaker summed it up in a humorous but kind of depressing way—there’s no perfect solution, and every browser does its own thing. Trying to optimize it yourself will probably do more harm than good, so for now... well, it's the way it is.

Despite the complex and somewhat frustrating topic, the speaker made it really entertaining. It was a fantastic talk, both in terms of the content and the delivery.

Font best practices

I also watched a really clever presentation about fonts, playing on the French word police, which means both "font" and "the police." Throughout the talk, the speaker used humorous images with police officers to explain how fonts work on the web, giving lots of real-world examples.

He provided several valuable tips, such as subsetting fonts, which involves eliminating any extraneous characters (glyphs) and loading only the necessary characters for the selected language. He also talked about choosing fallback fonts that are the same size as the main font to avoid layout shifts when the fonts swap. Another smart idea he shared was to use variable fonts, so you don't have to load separate files for bold or italic versions.
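To give an idea of what "only the necessary characters" means in practice, here is a small sketch that lists the code points actually used by a piece of text, formatted like the CSS `unicode-range` descriptor. The function name is hypothetical, and a real workflow would feed such a list to a dedicated subsetting tool rather than stop here.

```typescript
function unicodeRangeFor(text: string): string {
  // Collect unique code points (Array.from iterates by code point,
  // not by UTF-16 unit, so characters outside the BMP are handled).
  const seen: Record<number, boolean> = {};
  Array.from(text).forEach((ch) => {
    const codePoint = ch.codePointAt(0);
    if (codePoint !== undefined) seen[codePoint] = true;
  });
  const points = Object.keys(seen).map(Number).sort((a, b) => a - b);

  // Merge consecutive code points into ranges such as U+61-63.
  const hex = (n: number): string => n.toString(16).toUpperCase();
  const ranges: string[] = [];
  let i = 0;
  while (i < points.length) {
    let j = i;
    while (j + 1 < points.length && points[j + 1] === points[j] + 1) j++;
    ranges.push(i === j ? "U+" + hex(points[i]) : "U+" + hex(points[i]) + "-" + hex(points[j]));
    i = j + 1;
  }
  return ranges.join(", ");
}
```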

One more practical tip: load your fonts from the same server as your main site. This avoids extra DNS lookups and SSL handshakes, which can slow things down.

He packed his talk with helpful tools and advice on optimizing fonts, all with a humorous police theme running through it. It was a super informative and specific talk, perfect for anyone wanting to boost font performance. He kept it short but impactful!

Perception of time

The focus of this presentation (slides) was on how our brains subjectively perceive waiting time, whereas most of the other talks were about objectively calculating it through tech tools. The idea is that whether something feels fast or slow is extremely subjective and depends on a lot of external factors, like our age, our sex, our heart rate, and many other things that could be happening at the same time around us.

The speaker aimed to highlight how these factors can influence our perception of time. She used various examples to illustrate this concept. One was about an airport where passengers had to wait 15 minutes for their luggage and complained it was too long. To address this, the airport hired more staff to bring the bags faster, reducing the wait time to 7 minutes. However, passengers still found this too long. Instead of hiring more staff, the airport made the plane land at the opposite end of the airport. Passengers now had to walk 7 minutes to reach the baggage claim, which made the wait seem shorter because they had been busy walking for 7 minutes to get there and it didn't feel like they were actually waiting in line. Complaints dropped after that.

She also talked about the common placement of mirrors in elevators and queuing areas. Mirrors keep people engaged because they allow them to see themselves, which can make the wait feel shorter.

Another point she made was about the effect of heart rate on time perception. A faster heart rate can make time seem to pass more quickly, while a calmer heart rate can make it seem to pass more slowly.

As people age, their perception of time can speed up because they experience fewer new things. For children, each day is filled with new experiences, making time feel longer. For adults, days often feel repetitive, and there are fewer new experiences to create lasting memories. This can make time seem to pass more quickly as one grows older.

She also mentioned that we better remember the end of things (rather than the beginning or middle). She used a medical procedure as an example, asking study participants to rate their level of pain during the procedure. Two people underwent the procedure: one lasting 10 minutes with a sharp peak of pain, and another lasting 25 minutes with the pain diminishing towards the end. The patient with a longer procedure but less pain remembered the experience more favorably because the last part was less painful and didn't end on a high note of pain.

The speaker related this to web performance, suggesting that even if a site is slow, ensuring that the final steps are quick can leave users with a better impression. This means that even if the initial load is slow, a faster final experience can make the overall perception more positive.

I'm not sure exactly what the main message of the talk was (except that time is relative), but there were so many examples that it was very enjoyable to listen to (also, the storytelling was great).

DevTools deep dive

I also attended a talk on how to better use DevTools. The talk didn't rank among my favorites, primarily because it didn't align with my interests. I do not spend enough time in the DevTools in my daily routine to really make effective use of the advice.

I also found that the talk lacked storytelling and instead felt more like a detailed list of various features. As I couldn't find the narrative thread connecting it all together, it felt like a series of disconnected points. Still, he showed how to override HTTP headers, modify HTML with local copies, and even "fake" whole URLs directly from the DevTools, which I was impressed with and would definitely use.

Lighthouse CI

I also attended a talk called "Web Performance Testing." As the title seemed quite generic, I checked the description to understand the focus. It appeared to be about Continuous Integration (CI), which piqued my interest because this is another of my fields of interest. Leboncoin had already mentioned using Lighthouse CI in their workflow, so I wanted to learn more about it.

Unfortunately, I found her talk less engaging. It primarily covered Lighthouse and Lighthouse CI, including a brief explanation of Git and GitHub. She discussed how Lighthouse works, how to set up Lighthouse CI, and how to configure it with YAML or JSON files. While this information is useful, I felt I could have acquired the same details by reading the documentation in 25 minutes rather than attending the talk.

I was hoping for deeper insights on best practices for using Lighthouse CI. I would have been interested in learning more about how to use Lighthouse CI effectively in real-world scenarios, including best practices, pitfalls, and limitations. For instance, what is the ideal number of runs required to obtain stable data? Is it necessary to warm it up? Do you run two versions (one for non-regression, based on the current values, and one for improvement, with slightly more aggressive values)?

I would have appreciated learning about her personal experiences, challenges, and practical advice for optimizing Lighthouse CI. The talk focused too much on installation and basic setup, lacking the in-depth, actionable insights I was looking for.

Conclusion

In conclusion, I had a wonderful day at WeLoveSpeed. Many of the talks taught me something new, and I could also continue chatting with people in the hallways: people I had met previously, new faces, as well as organizers and sponsors. I had a blast, and see you next year!

NantesJS #70

16 Feb 2023

Yesterday I was at the NantesJS #70 meetup. I was one of the two speakers (sharing tips on how to write better tech blog posts), but also had the chance to see the first talk of the night.

The meetup took place at SII, and we were about 20 in the room (including organizers and speakers). Considering that it's post-COVID, a rainy day, vacation and strike day all at once, it's a pretty reasonable number.

The rest of this blog post is a rewrite of the notes I took while listening to the first talk, for people who didn't get the chance to attend. The speaker also wrote a detailed article on the topic and has a starter kit GitHub repository.

Pure-JavaScript mobile apps

Aleth Gueguen took us on a tour of the way she works, using standard JavaScript APIs to build mobile apps.

She develops business apps for professionals working in the field who need to record information. Think about people who repair large machines: they need to take pictures of what is broken, note which engine part to buy, and then move on to the actual fixing.

Their job is not to fill information into an app; their job involves more manual labor, but they need to use their phone 10-20 times per day, to fill in some information.

More importantly, they often work in environments where Internet access is either not possible (underground for example) or shaky (on a moving train). In that context, it's important that the app reliably works offline.

Thankfully for her, she has the perfect lab to simulate those conditions. She has a boat, and can sail at sea where she has limited access to Internet. She actually also has low access to electricity, which also forces her to be careful about the battery drain of the apps she creates.

Her motto is clear: Keep the complexity low

She aimed at producing frugal software:

She's using standard APIs, and no framework
She doesn't want to have to build/compile anything
She doesn't want more than two dependencies

In the interest of keeping the complexity low, she went the PWA route, so she doesn't have to deal with the slow AppStore and GooglePlay review processes and can update apps by pushing new content.

Considering that her users will use the app a dozen times a day, and never for more than five minutes at a time, the UI must be clean. No need for fancy gradients or animations. Her users care about getting the job done, not how beautiful the app looks.

Another specific aspect of her apps is that they work as a "backend for one person", as she puts it. Every user is able to add/edit/delete items in a list, which will be synchronized online with the server once a connection is available. But no user has access to the list of items added by another user. The app works as a single-person point-of-view.

She uses indexedDB as the database to store the items locally, and Service Workers to act as a fake proxy. Whenever the app is offline (the default state), data is stored in indexedDB. Interaction with indexedDB is still done through a fake-CRUD interface, with requests intercepted by the Service Worker.

When the app is online, the Service Worker stops intercepting the requests, and can send them directly to the backend server. It can also synchronize its local state data with the server. The server acts as the source of truth.
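To picture the offline/online switch without the Service Worker plumbing, here is a simplified simulation. A `Map` stands in for IndexedDB and a callback stands in for the backend; the class and method names are hypothetical, and the real app does this by intercepting HTTP requests rather than through direct method calls.

```typescript
interface FieldItem {
  id: string;
  label: string;
}

class OfflineFirstStore {
  private local = new Map<string, FieldItem>(); // stands in for IndexedDB
  private pending: string[] = [];               // ids waiting to be synchronized
  private pushToServer: (item: FieldItem) => void;

  constructor(pushToServer: (item: FieldItem) => void) {
    this.pushToServer = pushToServer;
  }

  // Equivalent of a "PUT /items/:id" issued by the page.
  put(item: FieldItem, online: boolean): void {
    if (online) {
      this.pushToServer(item); // not intercepted: goes straight to the backend
      return;
    }
    this.local.set(item.id, item); // intercepted: stored locally
    if (this.pending.indexOf(item.id) === -1) this.pending.push(item.id);
  }

  get(id: string): FieldItem | undefined {
    return this.local.get(id);
  }

  // Called when connectivity comes back: replay local changes to the server.
  flush(): number {
    const ids = this.pending;
    this.pending = [];
    ids.forEach((id) => {
      const item = this.local.get(id);
      if (item) this.pushToServer(item);
    });
    return ids.length;
  }
}
```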

IndexedDB is an old API, so she uses idb as a wrapper API Client that allows her to interact with it using promises. Still, it's not SQL, so joining or sorting results is hard. This prompted her to think about the schema.

She creates one table per value she needs to store. So instead of having an items table with an id, name, createdAt and image field, she would have names, creationDates and images tables that each store one kind of data (strings, dates, blobs) sharing the same id.
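Here is how I picture that schema, with plain `Map` objects standing in for IndexedDB object stores. The store and function names are mine, and `Uint8Array` replaces a real blob to keep the sketch self-contained. The remark about listing names without touching images is my own reading of the benefit, not something stated in the talk.

```typescript
// Each Map stands in for one object store; all of them share the same id as key.
const nameStore = new Map<string, string>();
const dateStore = new Map<string, Date>();
const imageStore = new Map<string, Uint8Array>();

interface FullItem {
  id: string;
  name: string;
  createdAt: Date;
  image: Uint8Array;
}

function saveItem(item: FullItem): void {
  nameStore.set(item.id, item.name);
  dateStore.set(item.id, item.createdAt);
  imageStore.set(item.id, item.image);
}

// Rebuild the full item by "joining" the three stores by hand.
function loadItem(id: string): FullItem | undefined {
  const itemName = nameStore.get(id);
  const createdAt = dateStore.get(id);
  const image = imageStore.get(id);
  if (itemName === undefined || createdAt === undefined || image === undefined) {
    return undefined;
  }
  return { id: id, name: itemName, createdAt: createdAt, image: image };
}

// Listing names only reads the small string store, never the image data.
function listNames(): string[] {
  const result: string[] = [];
  nameStore.forEach((value) => result.push(value));
  return result.sort();
}
```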

For Service Workers, she uses another dependency, workbox. It simplifies the lifecycle of Service Workers. Service Workers are a low-level, asynchronous API. Any code executed in one doesn't block the page rendering, but it also means it can't interact directly with the DOM of the page, and has to exchange messages with the JavaScript living in the page to perform any update.

She also has the Service Workers build the pages asynchronously in the background, before they are actually requested. When the user navigates to such a page, the Service Worker intercepts the request and serves the page from its cache.

To keep the cache fresh, the Service Worker refreshes pages in the background whenever one of their constituents changes. For example, whenever an item is edited, the page listing all the items is updated in the cache.

I find that this is an effective way to keep the cache fresh; you regenerate the content on the "server" side whenever it needs to be updated without waiting for the client to make a request.
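A toy version of that regeneration strategy could look like the sketch below. The `PageCache` class, the URL paths and the HTML snippets are hypothetical; in the real app this logic lives in the Service Worker and uses the browser cache.

```typescript
class PageCache {
  private pages = new Map<string, string>(); // path -> pre-built HTML
  private items = new Map<string, string>(); // id -> label

  // Editing an item immediately rebuilds every page that depends on it.
  upsertItem(id: string, label: string): void {
    this.items.set(id, label);
    this.pages.set("/items/" + id, "<h1>" + label + "</h1>");
    this.pages.set("/items", this.renderList());
  }

  // Serving a page is a plain cache lookup: nothing is built at request time.
  serve(path: string): string | undefined {
    return this.pages.get(path);
  }

  private renderList(): string {
    const rows: string[] = [];
    this.items.forEach((label) => rows.push("<li>" + label + "</li>"));
    return "<ul>" + rows.join("") + "</ul>";
  }
}
```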

She also explained some of the hacks she had to put in place to circumvent quirks related to cross-browser compatibility. One was a specific issue with Safari on iOS not being able to handle blob data coming from a FormData inside a Service Worker. Highly specific indeed.

Instead, she makes use of a PUT request to her server, passing the binary blob data as the content, and all the other fields as X- headers.
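Such a request could be assembled roughly like this. The field names, the `application/octet-stream` content type and the URL-encoding of header values are assumptions on my side (header values need to stay ASCII-safe, so some encoding is required, but I don't know which one she uses).

```typescript
interface UploadInit {
  method: "PUT";
  headers: Record<string, string>;
  body: Uint8Array;
}

// Binary data goes in the body; every other field travels as an X- header.
function buildUpload(fields: Record<string, string>, binary: Uint8Array): UploadInit {
  const headers: Record<string, string> = {
    "Content-Type": "application/octet-stream",
  };
  Object.keys(fields).forEach((key) => {
    // Encode free text so the header value only contains safe characters.
    headers["X-" + key] = encodeURIComponent(fields[key]);
  });
  return { method: "PUT", headers: headers, body: binary };
}
```

The returned object has the shape expected as the second argument of `fetch`, so it could be passed along with the URL of the item being saved.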

She also mentioned that Service Workers can be killed at any time when the browser thinks they are doing nothing. This means that when a Service Worker is waiting for a response from a server and that response takes a long time to arrive, it can get killed before it has had time to register that the data has been updated.

To avoid duplicating content on the backend by pushing the same content again, she first checks (using a HEAD request) if the item she's about to synchronize has already been saved. This adds one more request, but increases stability (nobody wants to have to clean up duplicated records).
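The check-then-push logic boils down to a few lines. In this sketch the network calls are hidden behind a hypothetical `SyncTransport` interface so the flow can be read (and tested) without a server; in the real app, `head` and `put` would be actual HTTP requests.

```typescript
interface SyncTransport {
  head(id: string): Promise<boolean>;               // true if the server already has the item
  put(id: string, payload: string): Promise<void>;  // uploads the item
}

async function syncItem(
  transport: SyncTransport,
  id: string,
  payload: string
): Promise<"skipped" | "uploaded"> {
  // One extra request, but it prevents pushing the same record twice
  // when a previous attempt succeeded without the worker noticing.
  if (await transport.head(id)) return "skipped";
  await transport.put(id, payload);
  return "uploaded";
}
```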

Her whole philosophy of "Less code = More perf" works well, and I enjoyed the talk and the idea of building frugal software.

Not using a framework has clear advantages, as it allows you to interact directly with the core APIs without having to wait for the framework to support them.

Once again, always bet on standards.

We Love Speed 2021

16 Dec 2021

Today in Lyon, France, was the We Love Speed conference. Its focus is on everything related to web performance. Even if the conference talks were only in French, I'll do this recap in English, to let more people learn from it. I took a lot of notes while attending the conferences, directly in markdown format, and now I'm editing them during my 4h30 train ride back home. I'm not even going to try to do a high-level presentation of the state of webperf today; instead I'll focus on writing short and concise recaps of each talk, with an overall conclusion at the end.

How to optimize 40k sites at once

This was a presentation by PagesJaunes, the French version of YellowPages. Their brand used to be a big thing before the Internet existed: those yellow pages were the only way to find a professional service in your area. Now, they've totally embraced the web and have created a spin-off organization called Solocal.

Solocal is a web agency that specializes in helping SMBs with their online presence by offering a package containing the development of a dedicated website, SEO, social media and ad presence as well as some advanced premium features (like a store locator) on demand. Most of their customers have fewer than 10 employees, are not tech savvy and don't really know how to use a website anyway, but they know that without one, they won't get customers in today's world.

Most websites created by Solocal follow some dedicated templates (custom design is a premium feature). And because webperf has an impact on SEO, they had to improve the perf of their templates to improve SEO. Every change they made had a direct impact on thousands of websites using this template.

First talk of the day, and a nice (albeit a bit long) introduction as to why webperf is important. This was really whetting my appetite to know more about what they do. Unfortunately, the next part of the talk was supposed to be done by the CTO, who couldn't make it to the conference and recorded a video instead. This was a last-minute change to the program, and the conference team didn't have time to properly set up the sound, so it was really hard to understand what he was saying. Anytime someone moved, the floor creaked louder than the video sound. I had to leave the room after 10 minutes of trying to understand the content. I figured my time would be better spent elsewhere, so I went downstairs to chat with a few people I had met. I hope the final recording will allow us to know more about the tech impact.

How to create a webperf culture in both dev and product

The second talk was much better; it ranked as my second favorite of the day. It was presented by two people from leboncoin (the French equivalent of Craigslist), one from the product team and one from the dev team.

Leboncoin is a pretty large company now, about 1400 employees; 400 of them in the tech team. It grew significantly in the past years, the tech team almost doubling in two years. Today, they have about 50 feature teams, handling about 30M unique visitors per month. Scaling the tech team and keeping that many people organised and synchronized is actually one of their main challenges today.

But back to webperf. Leboncoin actually started investing a lot in it because of a large perf regression in production they had in 2020. Their homepage was 7s slower than it used to be. They didn't catch it initially (they had no perf monitoring); it was only when their own customers and partners started complaining that they realized something was not working properly. And when they saw that it had a direct impact on their revenue, they tackled the issue by setting up a taskforce to remediate the regression.

The taskforce was made of experts from their various domains (search, ad, authentication, product details, etc). They also requested help from Google and Jean-Pierre Vincent (a webperf consultant, also a speaker at We Love Speed). They extracted a list of 40 things they should work on fixing. As they couldn't fix them all, they knew they had to prioritise them, but were not sure how to do so.

So they started by identifying who their median user is, so they could optimize for that user. Turns out their median user is using a Galaxy S7 on a poor connection with high latency. This was a defining piece of information for them; they knew they had to optimize the mobile display (for a phone that was already 5 years old) on a slow network.

Leboncoin's motto is all about "giving power to people to better live their day to day lives", by buying second hand stuff. So they couldn't really tell their users to "get a better phone". They had to make their website work for slow, low-end devices. So they took the most important item in their list, deployed a fix for it, and analyzed the performance. Then they went to the second item, deployed a fix for it, and analyzed again. And they went down their list like this until the initial 7s regression was fixed. They even went a bit further.

But they realized that it was a one-shot fix. If they didn't invest in long-term performance tracking and fixing, they would have to do it all over again in 6 months. Performance optimization is not a sprint, it's a marathon, and you have to monitor it continually. Which is what they did. They started by adding some live performance monitoring, logging the results in Datadog and sending a Slack alert in relevant channels when one metric was above a defined threshold. It did not prevent pushing slow code to production, but at least they had the history and alerts when things went wrong. They monitored only the most important pages (homepage, search results and details), and measured on different devices.

The second step was to be able to catch performance regressions before they hit production. They added a check on the bundle size of their JavaScript. This metric is pretty easy to get, and they plugged it into GitHub, so whenever the bundle size difference between the PR and the current code is too large (> 10%), the PR cannot be merged. Again, they tracked the change over time to have the history.
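The rule itself fits in a few lines. The 10% limit comes from the talk; the function name, the byte counts and the way the result would be reported back to GitHub are hypothetical.

```typescript
interface BundleCheck {
  increase: number;   // relative growth, e.g. 0.15 for +15%
  mergeable: boolean;
}

function checkBundleSize(
  currentBytes: number,
  pullRequestBytes: number,
  maxIncrease: number = 0.1
): BundleCheck {
  const increase =
    currentBytes === 0
      ? (pullRequestBytes > 0 ? Infinity : 0)
      : (pullRequestBytes - currentBytes) / currentBytes;
  return { increase: increase, mergeable: increase <= maxIncrease };
}
```

A bundle going from 200 kB to 230 kB (+15%) would block the PR, while 200 kB to 210 kB (+5%) would pass.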

They also added automated Lighthouse tests in their CI. Lighthouse is not a perfect tool, and its score shouldn't be taken as an absolute truth. Depending on your stack and use case, some metrics are more important than others. Still, it's an invaluable tool to make sure everybody in the team can talk about the same thing. Without this data, it would just have been another opinion. They added thresholds on some of those metrics in the same vein as the bundle size limit: if it goes too far above a threshold, the PR is blocked. This forced developers, designers and product owners to discuss the decisions, with an objective metric.

The next step was to teach people internally about all those metrics: what they mean and why they are important. They created a set of slides to explain each metric to each internal audience. For example, they had one talk to explain why the LCP (Largest Contentful Paint) is important to a designer, and how to keep it low. But they also had the same talk for a product manager or a developer, with specific explanations and examples, so everybody knows why it's important, even if they don't care about it for the same reasons. That way, all the teams had a shared goal rather than opposing objectives.

And the last step is to always keep one step ahead. The webperf field evolves more and more rapidly; there are new things to learn every few months. Browsers ship new features that could help or hinder webperf, and people need to be kept up to date with them.

Overall, this whole production issue turned into a taskforce that finally turned into a company-wide shift. Webperf talks are now common among developers, designers and product people alike. Their team keeps up to date with the latest news in the webperf world, they closely follow what Google or Vercel is doing, and those metrics became KPIs that everybody can understand.

Still, even with all this progress, they know that they are not perfect, and that they have to optimize for some metrics instead of others. When making one score better, they might make another worse. They're aware of that, but because they have defined which metrics are more important than others, they can usually decide whether the tradeoff is acceptable.

Why you need a markup expert

Jean-Pierre Vincent then presented one of the most technical talks of the day. Jumping right into it, he picked the example of a homepage with a hero image, some text and a CTA and showed how, in 2021, this could be optimized. The goal is to make sure it delivers its message as fast as possible, even on a low end mobile device with a slow connection.

His talk is pretty hard to recap, because of the number of (French-only) jokes in it, and the way it swung from high-level meta considerations into deep browser-specific hacks. The crux is that JavaScript never solves webperf issues. It only creates them. Sure, it comes with solutions to cancel out the issues it creates, making it neutral, but it will never make your pages load faster, no matter how optimized it is. If you really want to gain in performance, you have to invest in the underlying fundamental standards: HTML and CSS.

We should strive to develop every single component with a "good enough" version that works without JavaScript. Instead of having either nothing or an empty grey block while waiting for the JS to load, we could at least have an MVP without all the bells and whistles, but one that at least looks like the full component, and acts partly like one. He gave the example of a slideshow (carousel) component. With standard HTML and CSS, it is possible to have something that already works pretty well without a single byte of JavaScript.

In real life, barely 1% of our users navigate without JavaScript. A very small proportion manually disable JavaScript; the majority of that 1% are people either behind a corporate proxy that wrongfully blocks scripts, or people on a poor connection whose JavaScript hasn't finished downloading yet. If you're a small startup, there is no incentive to optimize for 1% of your users. But if you're a large company, 1% of your users can represent hundreds of thousands of dollars of revenue. In any case, forcing yourself to build for that 1% without JavaScript will only make you create faster components.

It doesn't mean you need to build for the lowest common denominator and serve this pure HTML/CSS version to all your users. But it should at least be the version they can interact with while your JavaScript is loading. Take for example a datepicker. Datepickers are incredibly useful components that many sites need. But the amount of JavaScript required for one to work properly (handling date formatting is in itself a complex topic) is often quite large. What about using the default, standard HTML datepicker provided by the browser, and loading the full-fledged datepicker only when the user needs it (for example on focus of the actual field)? That way the initial page load is fast, and the heavy datepicker is downloaded only when it is needed.
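
As a rough illustration of the "load it only when needed" part, here is a tiny helper that makes sure an expensive loader runs at most once, the first time it is requested. The helper name `lazyOnce`, the module path and the element id in the usage comment are hypothetical; the talk did not show this code.

```typescript
// Wraps a loader so the expensive work starts on first demand and runs only once.
// Note: a rejected load is cached too; a real implementation may want to retry.
function lazyOnce<T>(loader: () => Promise<T>): () => Promise<T> {
  let pending: Promise<T> | undefined;
  return () => {
    if (pending === undefined) {
      pending = loader();
    }
    return pending;
  };
}

// Browser usage sketch (hypothetical module path, element id and enhance function).
// The field is a native date input, so it is usable before any script arrives:
//
//   const loadPicker = lazyOnce(() => import("./fancy-datepicker"));
//   const field = document.querySelector("#departure");
//   field?.addEventListener("focus", () => {
//     loadPicker().then((picker) => picker.enhance(field));
//   }, { once: true });
```

The native input stays in the markup as the baseline, and the richer component only replaces it once the user has shown an intent to pick a date.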

Jean-Pierre then moved on to explaining the best ways to load an image if we want it to be displayed quickly. An image is worth a thousand words, as the saying goes, and it is even truer on a homepage where nobody will read your text. So you need to have your main image displayed as fast as possible. He warned us against using background-image in CSS for that (even if it comes with the nifty background cover properties) because some browsers will put background images at the bottom of their download priority list, preferring regular <img> tags instead. Modern browsers now have object-fit, which is similar to background-size: cover but for real image tags. For older browsers, you can hack your way around by adding a fake <img> tag referencing the same image as the one in background-image and adding a display:none.

On Chrome the problem is even more complex, as the image download priority is calculated based on the image visibility. If an image is not in the viewport, it will be put at the bottom of the priority list. This seems pretty clever, but the drawback is that the browser needs to know the placement of the images in the page before downloading them, so it needs to download the CSS before it downloads the images. The suggested way around this limitation, if you really need to download one image as fast as possible, is to add a <link rel="preload"> tag for this image. Preloading is a very interesting concept, but once again we have to be careful not to use it for everything. If we mark all our images for preloading, it's like we're not preloading anything.

Once we know how to download an image as soon as possible, we have to make sure to download the smallest viable image. The srcset attribute allows us to define (in addition to the default src attribute) specific images to load based on the current image display size. The syntax is very verbose and can quickly become pretty complex to maintain, as picking the right image depends on three factors: the current screen resolution, the Device Pixel Ratio (retina or not) and the relative size of the image compared to the page. The last two are tricky because screens have higher and higher pixel ratios (3 or more) and the relative size of the image is linked to your RWD breakpoints. This creates a larger and larger number of combinations, making this whole syntax harder and harder to write.
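
Because the syntax is so verbose, it is usually generated rather than written by hand. Here is a small sketch that builds a srcset value from a list of widths. The `?w=<width>` URL scheme is an assumption (it presumes some image service that can resize on the fly), and the function name is mine.

```typescript
// Hypothetical URL scheme: assumes an image service that accepts ?w=<width>.
function buildSrcset(baseUrl: string, widths: number[]): string {
  return widths
    .slice() // copy, so the caller's array is not reordered
    .sort((a, b) => a - b)
    .map((w) => `${baseUrl}?w=${w} ${w}w`)
    .join(", ");
}

// Usage sketch: the result goes in the srcset attribute, next to a sizes
// attribute describing how wide the image is displayed at each breakpoint:
//
//   <img src="/hero.jpg?w=640"
//        srcset="...value returned by buildSrcset..."
//        sizes="(min-width: 960px) 50vw, 100vw"
//        alt="...">
```

Using width descriptors (`320w`) together with `sizes` lets the browser factor in the Device Pixel Ratio by itself, which avoids enumerating every width/ratio combination manually.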

Still, as this is a standard syntax, directly at the HTML level, it will work on every browser (eventually) and will be much better than any JavaScript-based solution (or even CSS-based solution for that matter).

The last piece of advice he gave us on images was to make sure we are not needlessly downloading an image that is not going to be displayed (because we don't display it on mobile, for example). As usual, the fastest way to transfer bytes on the wire is to not transfer them at all, so if an asset is not going to be used, it should not be sent. But if you have a lot of images to display, you need to ensure you're giving them width and height dimensions, so at least their respective space in the layout is reserved and the page does not jump as images are downloaded.

Speaking of lazy loading images, there is no clear answer as to whether we should be using gray placeholders, blurry versions of the images, a spinner or a brand logo while waiting for an image to load. There is no one-size-fits-all solution; it all depends on the use case, the specific page and the other images around. This question needs to be answered by the design team, not the devs.

There was a lot of content packed into this talk; I would highly suggest you have a look at the recording and the slides (or even book a private consulting gig with him) because I can't do it justice. Still, the last topic he addressed was font loading. The best way to load fonts is to define a series of fallbacks, from the best to the worst. The best is the font already being installed locally on the user's computer; the worst is an old TTF/OTF format to be downloaded online. Then there is the question of font swapping: if the font needs to be downloaded, you should at least present the text in a fallback font while the font is loading. If your default font and real font are really similar, the swap should be almost imperceptible. If they are very different, the swap could create a noticeable jump (because the real font has larger/smaller letters, it could make buttons wrap onto two lines after the swap, for example). In that case, the suggested trick is to scale the default font up/down so it takes roughly the same space as the final font. That way the swap will seem less brutal.
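
One way to do this scaling today is the CSS `size-adjust` descriptor of `@font-face`. I'm not sure this is the exact technique shown in the talk, so take the following as my own sketch: it computes a scaling percentage from two average glyph widths (hypothetical measurements you would have to take yourself) and generates the matching fallback declaration. The font names are placeholders.

```typescript
// Both widths are hypothetical measurements: the average glyph width of each
// font, measured at the same font size.
function sizeAdjustPercent(webFontAvgWidth: number, fallbackAvgWidth: number): number {
  if (fallbackAvgWidth <= 0) {
    throw new Error("fallbackAvgWidth must be a positive number");
  }
  // Rounded to one decimal place.
  return Math.round((webFontAvgWidth / fallbackAvgWidth) * 1000) / 10;
}

// Builds an @font-face rule that aliases a locally installed font and scales it.
function fallbackFontFace(family: string, localFont: string, percent: number): string {
  return [
    "@font-face {",
    `  font-family: "${family}";`,
    `  src: local("${localFont}");`,
    `  size-adjust: ${percent}%;`,
    "}",
  ].join("\n");
}
```

The generated family would then sit right after the real font in the stack, for example `font-family: "Brand Font", "Brand Fallback", sans-serif`, so the text occupies roughly the same space before and after the swap.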

All those examples were highly interesting, but they will also most probably be outdated in a year or two. The main thing to remember here is that we need to invest in markup specialists, people who know the underlying HTML and CSS properties, and keep up to date with the way they evolve and are integrated by browsers. Knowing all those properties and keeping up to date is a full-time job, and you can't expect a front-end engineer to be able to juggle all that information while also keeping up to date with the JavaScript ecosystem (which is evolving at least as fast). It's time we better recognize markup specialists as experts, and what they bring to the webperf front.

Micro-frontends and their impact on webperf at Leroy-Merlin

This one was the most impressive talk of the day: how Leroy-Merlin (5th French e-commerce website) rewrote their whole front into a micro-frontend architecture, and what the impact on webperf was.

For a bit of context, Leroy-Merlin has 150 physical stores, they do a mix of online and physical business while most of their competitors are pure players (like ManoMano, which was actually doing a talk in the same room right after this one). But back to Leroy-Merlin: their traffic is mostly (55%) coming from mobile, and the average user journey is 7 pages long. This is going to become important data for the rest of the talk.

The two speakers were tech leads of the front. They were upfront about the KPIs they wanted to optimize: great SEO, quick Time to Market (ability to release new features quickly), fast performance, data freshness and resiliency. The quality/price/availability of the products in store isn't part of their scope. They need to make sure the website loads fast and displays relevant information no matter the conditions.

Before their rewrite, they used to have one large monolith and a dev team of a bit more than 100 devs. This created a lot of friction in their deployments, as everybody had to wait in a queue to release their part of the code. Their webperf was good, but they had to manually deploy their servers and had some issues with their load balancer (sticky sessions that dropped customers when a server was down).

Individually those problems weren't too bad. But all together, it meant it was time to restart from scratch and think of a solution that would fix all those problems at once: automated deployments, stateless machines and autonomous teams. For the infra part they embraced the Infrastructure as Code with Docker, and for the front went with a micro front-end architecture, where each page is split into "fragments". They have one fragment for the navigation bar, one for the "add to cart" button, one for the similar items, one for calculating the number of items in stock, etc. Each fragment is owned by a different team (made up of front/back engineers, product owner, manager and designer).

Each team can then pick the best stack for their specific job. The most complex components are made in React (about 5% of them), while the vast majority are made of Vanilla JavaScript. Because they split a large page into smaller, simpler, components they didn't need a heavy framework because each fragment was doing one simple thing. This allowed them to heavily simplify the complexity of their code, leading to a much better Time to Market. Each fragment being like a self-contained component, along with assets and specific logic, it's also easier to remove dead code than when it's sprawled over the whole codebase.

They have a backend UI tool that lets them build custom pages by drag'n'dropping fragments (the result is also securely saved as YAML configuration files, so they can redeploy with confidence). The final page is then assembled in the backend when requested. It picks the page template (homepage, listing, or product detail), and replaces the 30 or so fragment placeholders with the corresponding fragment code. This fully assembled page is then sent to the browser and kept in cache for future requests. Thus, the backend job is also heavily simplified. It mostly does templating work, once again reducing the complexity.
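
The assembly step itself can be pictured as plain templating. Here is a minimal sketch, assuming placeholders written as HTML comments (`<!-- fragment:NAME -->`); the placeholder syntax and the function name are my own invention, not what Leroy-Merlin actually uses.

```typescript
// Hypothetical placeholder syntax: <!-- fragment:NAME -->
function assemblePage(template: string, fragments: Record<string, string>): string {
  return template.replace(
    /<!-- fragment:([\w-]+) -->/g,
    (_match: string, name: string): string => {
      // Own-property check, so names like "constructor" are not resolved by accident.
      if (!Object.prototype.hasOwnProperty.call(fragments, name)) {
        throw new Error(`Missing fragment: ${name}`);
      }
      return fragments[name];
    }
  );
}
```

For example, a template containing `<!-- fragment:nav -->` and `<!-- fragment:add-to-cart -->` would get both comments replaced by the markup of the matching fragments, and the resulting string is what gets cached.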

One limitation of such an architecture is that any personalization data (current user, number of items in cart, availability of a product) cannot be served directly by the backend, and has to be fetched by the frontend. But because 99% of the page has already been pre-rendered on the server, fetching those data requires only a minimal amount of JS and is quickly executed in the front-end. Because their average user journey is 7 pages long, they decided that it wasn't worth downloading a full JavaScript framework for only 7 pages and so they try to really do most of the stuff in vanilla JavaScript.

But this choice creates another limitation. Because each fragment is isolated, code is often duplicated. And because no framework is used, all the fancy tooling and helpers that improve the Developer Experience are missing. Also, coding without a framework proved to make hiring harder. For all those reasons, they extracted some of the most commonly shared pieces (like the design system, the API connection layer, polyfills, etc.) into their own private npm modules that each fragment can import. For isolating CSS rules, they prefix each CSS selector with the unique ID of the matching fragment.
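
To illustrate the CSS isolation, here is a small sketch of such a prefixing step. I'm assuming that the fragment root element carries an `id` attribute and that the prefix is an ID selector; the talk only mentioned "the unique ID of the fragment", so the exact mechanism may differ. Rules are passed as already-parsed objects to keep the example short.

```typescript
// Hypothetical rule shape: a real build step would parse actual CSS files.
interface CssRule {
  selectors: string[];
  declarations: string;
}

// Prefixes every selector with the fragment's root ID (an assumption, see above).
function scopeRules(fragmentId: string, rules: CssRule[]): string {
  return rules
    .map((rule) => {
      const scoped = rule.selectors
        .map((selector) => `#${fragmentId} ${selector.trim()}`)
        .join(", ");
      return `${scoped} { ${rule.declarations} }`;
    })
    .join("\n");
}
```

With this in place, a `.btn` rule written by the cart team can no longer leak into a `.btn` used by the navigation team, since each one only matches inside its own fragment root.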

Having the full page split into smaller chunks also allowed them to increase their resilience. They could define which fragments are considered primary or secondary. A primary fragment is needed for the page to work (like displaying the product, or the "add to cart" button). If this fragment fails to build, for whatever reason, then the whole page fails to load. On the other hand, if a secondary fragment (like a "similar items" carousel, or the page footer) fails to load, it is simply ignored and removed from the markup. This allowed them to be more resilient to errors, and to scale better during high traffic spikes. They went even further and made the secondary fragments lazy-load: their JavaScript is loaded only when the fragment is about to enter the viewport, making the first page load really fast.
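
The primary/secondary distinction boils down to a different error-handling policy per fragment. Here is a minimal sketch of that policy, with hypothetical names (`FragmentSpec`, `renderFragments`) and synchronous rendering for simplicity:

```typescript
// Hypothetical shapes, for illustration only.
interface FragmentSpec {
  name: string;
  primary: boolean;
  render: () => string; // may throw if the fragment cannot be built
}

function renderFragments(specs: FragmentSpec[]): Record<string, string> {
  const rendered: Record<string, string> = {};
  for (const spec of specs) {
    try {
      rendered[spec.name] = spec.render();
    } catch (error) {
      if (spec.primary) {
        // Without a primary fragment the page is useless: fail the whole page.
        throw new Error(`Primary fragment "${spec.name}" failed, the page cannot be served`);
      }
      // Secondary fragment: drop it from the markup and carry on.
      rendered[spec.name] = "";
    }
  }
  return rendered;
}
```

A broken "similar items" carousel then results in a page without a carousel, while a broken product fragment results in an error page rather than a half-working one.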

But that's not all, and they went even further with their caching mechanism. As we've seen above, they cache the backend response of the built pages. But what if the page layout changes? What if a product is no longer in stock and the layout changes completely? They couldn't use revved URLs because they wanted to keep good SEO and unique URLs. They also didn't want to introduce a TTL because it would have drastically increased the complexity of handling the cache.

Instead, they opted for a reactive approach with a low TTL. Every page is cached in the browser for a short amount of time (I don't remember if they said the exact value, but I expect 1 or 2 seconds). This is low enough that a regular user won't notice, but high enough that thousands of users pressing F5 on Black Friday won't kill the server. But the same page is cached on the server forever. The very clever and tricky part is that they update their server cache whenever their database is updated. They listen to any change in their config database, and if a change requires a cached page to be regenerated, they regenerate it asynchronously. That way users still have fresh data, but the server isn't under a lot of pressure.
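
Here is how I picture the server side of this, as a simplified sketch. The class and method names are hypothetical, the rebuild is synchronous here while theirs is asynchronous, and I'm assuming the change listener already knows which URLs are affected by a given change.

```typescript
// Simplified sketch: no expiry, entries are only replaced when data changes.
class PageCache {
  private readonly pages = new Map<string, string>();

  constructor(private readonly build: (url: string) => string) {}

  // Serves the cached page, building it on first request.
  get(url: string): string {
    const cached = this.pages.get(url);
    if (cached !== undefined) {
      return cached;
    }
    const fresh = this.build(url);
    this.pages.set(url, fresh);
    return fresh;
  }

  // Called by a (hypothetical) database change listener with the affected URLs.
  onDataChange(affectedUrls: string[]): void {
    for (const url of affectedUrls) {
      // Only pages that are already cached need regenerating.
      if (this.pages.has(url)) {
        this.pages.set(url, this.build(url));
      }
    }
  }
}
```

The interesting property is that the cost of rebuilding is paid once per data change, not once per visitor, which is what keeps the servers quiet during traffic spikes.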

In addition to all that, they even have different pages generated based on the User-Agent. A modern browser won't have all the polyfills added, while an old one might. The same goes for mobiles that might not require some parts of the markup/assets, so these are skipped during the page creation, once again for a faster load.

I told you it was the most impressive talk of the day. They went very far into the micro-frontend direction, and even beyond, taking full advantage of what its modularization approach made possible. This full rewrite required synchronization of the data, front, back and infra teams and also a full reorganization of the feature teams. This went far beyond a tech project, and had impact on the whole company organization.

SpartacUX, ManoMano's rewrite to micro-frontend

The next talk was pretty similar to the one presented by Leroy-Merlin. This time it was ManoMano, actually one of their competitors, explaining a similar approach they had. With both talks being one after the other, we couldn't help but compare it to what we had just seen in the talk before. ManoMano's infrastructure is pretty impressive as well, but Leroy-Merlin went so far ahead that it was hard to be as excited about this second talk as I was for the first one. There was also a lot of overlap with what Le Bon Coin presented earlier in the morning about how they track webperf stats in their PRs and dashboards.

ManoMano started as a Symfony backend with Vanilla JavaScript. They had trouble recruiting Vanilla JS developers, so they moved the front to React. This hurt their SEO as their SSR wasn't properly working with React. They also still had the previous monolith as the backend, and felt like they were duplicating code on both ends, that their performance was getting even worse, and people in the team were struggling with the new complexity to orchestrate.

So they started the really cleverly named SPArtacUX project. A way to bridge the Single Page Application with a better User Experience. The goal was to have a simple codebase for the dev team, while transferring as few bytes as possible, for faster rendering. They opted for micro-frontend architecture (I see a trend here), using Next.js (I see a trend here as well) because it offered nice SSR and they were already proficient with React. They moved to TypeScript for type robustness and used Sass for CSS. As a side note, I still don't really understand why so many companies keep using Sass for their CSS stack (it's slow, it leaks styles, it's non-standard; Tailwind would be a better choice IMO, especially when you already have a design system).

They also started measuring Web Vitals and bundle size in all their production releases and Pull Requests. They plugged Lighthouse, WebPageTest, Webpack Bundle Analyzer and Chrome Dev Tools into their CI to feed Datadog dashboards and static reports. When they had enough data to see a trend, they started to optimize. Their first target was the third-party tracking scripts that were heavily slowing the page down. Those tags are very hard to remove because they can have a business impact; you cannot remove too much data, otherwise you're blind to how your business is performing. They had to get an exhaustive list of everything that was loaded and remove the ones that were no longer used.

Then they had to rewrite a fair number of their components that they thought were responsive, but were actually downloading both a desktop and a mobile version and hiding one of the two based on the current device. This meant that a lot of HTML/CSS, and sometimes even images, were downloaded without ever being displayed. They put a CDN in front of all their pages. Just like Leroy-Merlin, they build the pages based on a layout and placeholders to replace with fragments.

They pay special attention to optimizing the loading order of assets, only loading assets that are in the current viewport and lazy loading everything else. They invested a lot of time into code splitting and tree shaking to only load what they really needed in their final build. They also made sure any inline SVG icon was only included once, with the other occurrences referencing the first one, to avoid downloading the same heavy SVG icon several times.
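
The SVG trick can be sketched as a small renderer that inlines the full icon markup the first time, and emits a lightweight `<use>` reference afterwards. This is my own reconstruction, not ManoMano's code: the class name, the `icon-` id prefix and the fixed 24x24 viewBox are assumptions.

```typescript
// One instance per rendered page, so the "already emitted" state is per page.
class IconRenderer {
  private readonly emitted = new Set<string>();

  // icons maps an icon name to its inner SVG markup (paths, groups, etc.).
  constructor(private readonly icons: Map<string, string>) {}

  render(name: string): string {
    const body = this.icons.get(name);
    if (body === undefined) {
      throw new Error(`Unknown icon: ${name}`);
    }
    const id = `icon-${name}`;
    if (this.emitted.has(name)) {
      // Later occurrences only reference the markup that is already in the page.
      return `<svg viewBox="0 0 24 24"><use href="#${id}"></use></svg>`;
    }
    this.emitted.add(name);
    // First occurrence carries the actual markup, wrapped in an identifiable group.
    return `<svg viewBox="0 0 24 24"><g id="${id}">${body}</g></svg>`;
  }
}
```

On a listing page showing the same cart icon on 50 product cards, the heavy path data is then sent once instead of 50 times.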

In conclusion, they did a really good job on their rewrite, a bit like a mix of Leroy-Merlin on their micro-frontend split and Le Bon Coin on their automated webperf monitoring; but it felt like I had already seen that today. I'm sure if I had seen this talk first, I would have been more ecstatic about it.

What is faster than a SPA? No SPA.

The last talk of the day was by Anthony Ricaud, who made a clean and concise debunking of the myth that SPAs are inherently faster because they only need to load the diff between two pages. Because he was going against a commonly accepted idea, he had to put us in the right mindset first by reminding us of the cognitive biases and rhetorical techniques we're all guilty of.

Then he showed, with many example recordings (of actual websites we had seen during the day), how a version without a SPA (so, with simpler GET requests to a server) was actually faster. The reasoning is pretty simple, and went along with what Jean-Pierre Vincent said earlier: JavaScript will never make your pages faster; at best it will offset its own slowness.

The main reason for that is that with a SPA, you need to download a lot of blocking JavaScript, which you don't have to with classical HTTP navigation. Also with a SPA, you need to get a JSON representation of your state, transform it into a VDOM, then update the existing DOM. With classical HTTP navigation, the browser can start rendering the DOM on the fly, while it's actually still downloading the page through HTTP.

In addition, when doing classical HTTP navigation, your browser UI will let you know if the page is loading, while with a SPA it's up to the SPA to have its own loading indicators (which they usually don't have, or trigger too late). This tied in well with what Leroy-Merlin was saying earlier, in that they use pure Vanilla JS for 95% of their fragments, and with Jean-Pierre Vincent once again, in that you can already do a lot with pure standard HTML/CSS and that JavaScript is only needed for progressive enhancement.

He then went on to demo Hotwire (HTML Over The Wire), a hybrid approach that should take the best of both worlds. It uses a limited amount of JavaScript, plugging itself onto standard HTML markup, to refresh only part of a page in an unobtrusive manner. The idea is to tag parts of our HTML pages with tags indicating that an area should be updated without the whole page being refreshed. The minimal JavaScript framework then asynchronously queries the new page; the server returns an HTML version of the new page, and the framework keeps only the area it needs to update and swaps the old area with the new one in the current page.

To be honest, the idea seems interesting, but the syntax seemed a bit too verbose and still a bit uncommon. It made me think of Alpine.js, which follows a similar pattern of annotating HTML markup with custom attributes to streamline JavaScript interaction with it. I'm still unsure if this is a good idea or not; it reminds me of Angular going fully in that direction, and it didn't really go well for them: it created an intermediate layer of "almost HTML".

Conclusion

I'm really glad I could attend this event in person. It has been too long since I could go to conferences because of the COVID situation. Having a full day of webperf peeps sharing their discoveries, and seeing how far the webperf field has come in the past few years, has been really exciting. It's no longer a field only for deep tech people passionate about shaving off a few ms here and there; it now has a proven direct impact on SEO, revenue, trust and team organization.

Thanks again to all the organizers, speakers and sponsors for making such an event possible!
