The official blog of Codecademy Blog
Stay up to date with feature releases, events, and much more.
In 2013, I landed in Shanghai for a few meetings. My first few minutes walking around the city led to a conversation with a stranger I met in a bar after his long day of work. We discussed life in Shanghai, where he lived, how long he’d been in Shanghai, and what he did for a living. He told me he had just landed a new job with a programming consultancy in Shanghai, and said it all started with a website— Codecademy.com — through which he’d learned to code.
This scene repeated itself months later in Dublin, where I was meeting James Whelton of CoderDojo. We talked about programming education and noticed the couple next to us were talking about programming too. It turned out that they were both Dublin-based programmers - he for Facebook and she for a software consultancy. She talked about how she was planning to leave her job writing Apex (for Salesforce) to take a job writing Ruby, which she had just learned on Codecademy.
These stories aren’t unique — in fact, they’re a reality for most Codecademy users, 70% of whom live outside the United States. From the beginning, we’ve watched with amazement as Codecademy spread. The day we launched, we expected traffic to die down overnight in California, but we hadn’t taken into account that people were just signing on in other parts of the world. Since then, we’ve kept our global audience in mind with everything we’ve built, realizing that the power of an education transcends borders beyond the city we build Codecademy in and the language we speak.
Codecademy: Bringing Skills to You, Wherever You Are
Today, we’re bringing easy access to a world-class skills education to even more people across the world — hoping they’ll benefit from Codecademy in the same way that our more than 24 million existing learners have. We’ve worked to translate Codecademy to Spanish, Portuguese, and French, with more languages on the way. But that’s not all — we’re working closely to create communities and become embedded in new countries to help new learners all over the world become empowered with the skills they need to succeed in the 21st century. We’ve got amazing partners to help us bring Codecademy to five new countries (along with those that speak their languages!).
The UK made news as the first G8 country to mandate programming education for all primary and secondary schoolers. We’ve worked hand-in-hand with many organizations in the UK over the past few years — sponsoring Code Club as they bring programming education to after school groups, working with the Computing At School network to help connect teachers with resources, and with the government itself to bring programming to classrooms — and we’re now doubling down on our commitment to the UK by opening our first international office in London, headed by Rachel Swidenbank.
Libraries Without Borders (Bibliothèques sans Frontières) has worked tirelessly over the past few years to expand access to literacy across French-speaking countries, among them Haiti, Cameroon, and others. Today, Codecademy is working with Libraries Without Borders to translate Codecademy into French and to implement pilot programs to reduce unemployment and include programming in schools. In addition, Codecademy will be a component of the recently announced Ideas Box (designed by Philippe Starck), a project that will be deployed in refugee camps and disaster zones across the world to empower individuals with the skills to improve their lives. Grants from the public and private sector in France helped to make all of this possible.
The Lemann Foundation is the largest education foundation in Brazil, funding innovation on the K-12 level and elsewhere by fostering innovation inside the country and by bringing international technological developments to students. Codecademy is available in Portuguese today thanks to close work with the Lemann Foundation and will soon launch in several Brazilian pilots. One of our proudest moments was talking to Brazilian teachers a month ago in São Paulo about today’s launch of Codecademy in Portuguese, their native language.
Argentina and Buenos Aires
The Government of Buenos Aires, led by Mauricio Macri, has made an ambitious commitment to bringing skills and programming education to all of its citizens by working together with Codecademy. Jorge Aguado, the head of educational technology for the City, has worked to make sure that Buenos Aires is one of the first cities in South America (and the world!) to make a statement about its digital future, tying programming into every school in Buenos Aires, pursuing a campaign to provide skills to the unemployed, and training government workers in technology. Both we and the government of Buenos Aires think this is the first commitment of its kind in the South American region and think it’s a terrific template for other cities (and governments) moving forward. Buenos Aires’ commitment is particularly notable given that its Spanish translations will be available to the entirety of the Spanish-speaking world.
Estonia’s Tiger Leap program has helped it become one of the most advanced digital economies in the world. We hope to support this commitment by working with the Estonian government to help every Estonian K-12 student learn to program.
Codecademy Speaks Spanish, Portuguese, French, and more!
It’s often said that code is the “language of the 21st century.” We at Codecademy think that code is a language that’s cross-border and truly international, and that our new work internationally is an essential step towards bringing advanced digital skills to people all over the world. We can’t wait to hear stories from the millions of new Codecademy learners to come and from the additional partners we’ll be announcing soon!
Building education for the world isn’t easy -- technically or from a product perspective. Want to work on projects like this? We’re hiring!
As product development becomes more and more data driven, the demand for essential data analysis tools has surged dramatically. Today, we are excited to announce that we've open sourced EventHub, an event analysis platform that enables startups to run their funnel and cohort analysis on their own servers. Getting EventHub deployed only requires downloading and executing a jar file. To give you a taste of what EventHub can do, we set up a demo server with example funnel and cohort analysis queries.
EventHub was designed to handle hundreds of events per second while running on a single commodity machine. With EventHub, you don’t need to worry about pricey bills. We did this to make it as frictionless as possible for anyone to start doing essential data analysis. While more details can be found in our repository, the following are some key observations, assumptions, and rationales behind the design.
Basic funnel queries only require two indices: a sorted map from (event_type, event_time) pairs to events, and a sorted map from (user, event_time) pairs to events.
Basic cohort queries only require one index: a sorted map from (event_type, event_time) pairs to events.
A/B testing and power analysis are simply statistical calculations based on funnel conversion and pre-determined thresholds.
Therefore, as long as the two indices in the first bullet point fit in memory, all basic analysis (queries by event_type and date range) can be done efficiently. Now, consider a hypothetical scenario in which there are one billion events and one million users. A sorted map implementation like an AVL tree, RB tree, SkipList, etc. can be dismissed, as the overhead of pointers would be prohibitively large. On the other hand, a B+tree may seem to be a reasonable choice. However, since events are ordered and stored chronologically, sorted parallel arrays are a much more space-efficient and simpler implementation. That is, the first index from (event_type, event_time) pairs to events can be implemented as one array storing event_time for each event_type and another parallel array storing event_id, and similarly for the other index. Separate indices are needed for looking up from event_type or user to their corresponding parallel arrays, but since event_types and users are orders of magnitude fewer than events, the space overhead is negligible.
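The parallel-array idea can be sketched in a few lines of Python. This is only an illustration under assumptions, not EventHub's actual code: the class name `ParallelArrayIndex` and its methods are hypothetical, and the standard-library `array` module stands in for a compact, pointer-free layout of 8-byte values.

```python
from array import array
from bisect import bisect_left


class ParallelArrayIndex:
    """Hypothetical sketch: maps a key (event_type or user) to two parallel arrays.

    times[i] and ids[i] describe the same event. Events are appended in
    chronological order, so both arrays stay sorted without a tree.
    """

    def __init__(self):
        self._times = {}  # key -> array of 8-byte event_time values
        self._ids = {}    # key -> array of 8-byte event_id values

    def append(self, key, event_time, event_id):
        times = self._times.setdefault(key, array("q"))
        ids = self._ids.setdefault(key, array("q"))
        if len(times) and event_time < times[-1]:
            raise ValueError("events must be appended chronologically")
        times.append(event_time)
        ids.append(event_id)

    def range(self, key, start_time, end_time):
        """Return ids of events for key with start_time <= time < end_time."""
        if key not in self._times:
            return []
        times = self._times[key]
        lo = bisect_left(times, start_time)  # binary search, no pointers
        hi = bisect_left(times, end_time)
        return list(self._ids[key][lo:hi])


index = ParallelArrayIndex()
index.append("signup", 100, 1)
index.append("signup", 105, 4)
index.append("purchase", 107, 5)
index.append("signup", 110, 7)
print(index.range("signup", 100, 110))  # [1, 4]
```

A range lookup is two binary searches plus a slice, and each array holds one kind of value, which is what makes the layout easy to compress.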
With parallel array indices, the amount of memory needed is approximately (1B events * (8 bytes for timestamp + 8 bytes for event id)) * 2 = ~32G, which still seems prohibitively large. However, one of the biggest advantages of using parallel arrays is that within each array, the content is homogeneous and thus compression friendly. The compression requirement here is very similar to compressing posting lists in search engines, and with an algorithm like p4delta, the compressed indices can reasonably be expected to be <10G. In addition, EventHub makes another important assumption: that date is the finest granularity. As event ids are assigned in monotonically increasing order, the event id itself can be thought of as a logical timestamp. As EventHub maintains another sorted map from date to the smallest event id on that date, all the queries filtered by date range can be translated to queries filtered by event id range. With that assumption, EventHub was able to get rid of the time array and further reduce the index size by half (<5G). Lastly, since the indices are event_ids stored chronologically in an array, and the array is stored as a memory-mapped file, the indices are very friendly to the kernel page cache. Also, assuming most of the analysis only cares about recent events, as long as the tail of the indices fits in memory, most of the analysis can still be computed without touching disk.
At this point, as the size of the basic indices is small enough, EventHub is able to answer basic funnel and cohort queries efficiently. However, since there are no indices for other properties on events, for queries filtered by event properties other than event_type, EventHub still needs to look up the event properties from disk and filter events accordingly. The space and time complexity of this type of query is not easy to estimate analytically, but in practice, when we ran our internal analysis at Codecademy, the running time for most funnel or cohort queries with event property filters was around a few seconds. To optimize query performance, the following key features are implemented; more details can be found in the repository.
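As a rough check of the arithmetic above, and to show how a date range can be turned into an event id range, here is a small Python sketch. The `DateIndex` class and its method names are hypothetical, chosen for illustration, and the sizes are uncompressed estimates only.

```python
import datetime as dt
from bisect import bisect_left

# Back-of-the-envelope index sizes for the hypothetical scenario.
EVENTS = 1_000_000_000
BYTES_PER_VALUE = 8  # one 64-bit timestamp or one 64-bit event id
NUM_INDICES = 2      # one index by event_type, one by user

with_time_bytes = EVENTS * (BYTES_PER_VALUE + BYTES_PER_VALUE) * NUM_INDICES
ids_only_bytes = EVENTS * BYTES_PER_VALUE * NUM_INDICES
print(with_time_bytes / 1e9)  # 32.0 (GB, uncompressed)
print(ids_only_bytes / 1e9)   # 16.0 (GB, uncompressed, time arrays dropped)


class DateIndex:
    """Hypothetical sorted map from date to the smallest event id on that date."""

    def __init__(self):
        self._dates = []      # sorted; one entry per date that has events
        self._first_ids = []  # parallel array: smallest event id on that date

    def record(self, date, event_id):
        # Ids increase monotonically, so only the first event seen on a
        # new date has to be stored.
        if not self._dates or date > self._dates[-1]:
            self._dates.append(date)
            self._first_ids.append(event_id)

    def id_range(self, start_date, end_date, next_event_id):
        """Translate dates [start_date, end_date) into ids [start_id, end_id)."""
        lo = bisect_left(self._dates, start_date)
        hi = bisect_left(self._dates, end_date)
        n = len(self._dates)
        start_id = self._first_ids[lo] if lo < n else next_event_id
        end_id = self._first_ids[hi] if hi < n else next_event_id
        return start_id, end_id


dates = DateIndex()
for day, event_id in [(1, 0), (1, 1), (2, 2), (4, 3), (4, 4)]:
    dates.record(dt.date(2014, 1, day), event_id)
print(dates.id_range(dt.date(2014, 1, 2), dt.date(2014, 1, 4), next_event_id=5))  # (2, 3)
```

Once a query carries an id range instead of a date range, the per-key id arrays can be binary searched directly, so the time arrays are no longer needed.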
Each event has a Bloom filter to quickly reject events whose properties don't exactly match the filter
LRU Cache for events
Assuming the Bloom filters are in memory, EventHub only needs to do disk lookups for events that actually match the filter criteria (true positives) as well as the false positives from the Bloom filters. As the size of the Bloom filters can be configured, the false positive rate can be adjusted accordingly. Additionally, since most queries only involve recent events, EventHub also keeps an LRU cache for events to optimize query performance. Alternatively, EventHub could have implemented an inverted index, as search engines do, to facilitate fast equality filters. The primary reason for adopting Bloom filters with a cache is that it doesn't require adding more posting lists as new event properties are added, and we believe that for most use cases and with proper compression, EventHub can easily cache hundreds of millions of events in memory and achieve low query latency.
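The interplay of the per-event Bloom filter and the LRU cache can be sketched as follows. This is a simplified illustration, not EventHub's implementation: the class names, the `key=value` encoding of properties, the hashing scheme, and the `load_from_disk` callback are all assumptions made for the example.

```python
import hashlib
from collections import OrderedDict


class BloomFilter:
    """Tiny Bloom filter; its size controls the false positive rate."""

    def __init__(self, num_bits=256, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit set

    def _positions(self, item):
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item):
        # False means "definitely absent"; True may be a false positive.
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict least recently used


def filter_events(event_ids, filters, blooms, cache, load_from_disk):
    """Return the ids whose properties equal every key/value in filters."""
    wanted = [f"{k}={v}" for k, v in filters.items()]
    matches = []
    for event_id in event_ids:
        if not all(blooms[event_id].might_contain(w) for w in wanted):
            continue  # rejected in memory; no disk access needed
        event = cache.get(event_id)
        if event is None:
            event = load_from_disk(event_id)  # true or false positive
            cache.put(event_id, event)
        if all(event.get(k) == v for k, v in filters.items()):
            matches.append(event_id)
    return matches
```

Because every candidate is re-checked against the loaded event, a Bloom filter false positive only costs an extra lookup and never a wrong result.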
Lastly, EventHub as is doesn't compress the index, and we left that as one of our to-dos. In addition, the following two features can easily be added to achieve higher throughput and lower latency if needed.
Event properties can be stored in a column-oriented layout, which allows a high compression rate and great cache locality
Events from each user in funnel analysis, cohort analysis, and A/B testing are independent. As a result, horizontal scalability can be trivially achieved by sharding by user.
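To illustrate why sharding by user is straightforward, the sketch below counts funnel conversions per shard and sums the results. It is a toy model under assumptions: the function names, the MD5-based shard assignment, and the simplified "steps must occur in order" funnel definition are illustrative and not taken from EventHub.

```python
import hashlib


def shard_for_user(user, num_shards):
    """Stable assignment of a user to a shard (hashing scheme is an assumption)."""
    digest = hashlib.md5(str(user).encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def funnel_counts(events, steps):
    """Count users reaching each funnel step.

    events is a chronological list of (user, event_type) pairs; a user
    advances only when their next event matches the next step.
    """
    progress = {}
    for user, event_type in events:
        step = progress.get(user, 0)
        if step < len(steps) and event_type == steps[step]:
            progress[user] = step + 1
    counts = [0] * len(steps)
    for reached in progress.values():
        for i in range(reached):
            counts[i] += 1
    return counts


def sharded_funnel_counts(events, steps, num_shards):
    """Split events by user, run the funnel on each shard, and add the counts."""
    shards = [[] for _ in range(num_shards)]
    for user, event_type in events:
        shards[shard_for_user(user, num_shards)].append((user, event_type))
    totals = [0] * len(steps)
    for shard in shards:  # each shard could run on a separate machine
        for i, count in enumerate(funnel_counts(shard, steps)):
            totals[i] += count
    return totals


events = [
    ("a", "signup"), ("b", "signup"), ("a", "submit"),
    ("c", "submit"), ("b", "signup"), ("a", "purchase"),
]
steps = ["signup", "submit", "purchase"]
print(funnel_counts(events, steps))             # [2, 1, 1]
print(sharded_funnel_counts(events, steps, 4))  # [2, 1, 1]
```

All of a user's events land on the same shard, so no shard needs data from another and the per-shard counts simply add up.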
As always, it's open sourced, and pull requests are highly welcome.
If you like the post, you can follow me (@chengtao_chu) on Twitter or subscribe to my blog "ML in the Valley". Also, special thanks to Bob Ren (@bobrenjc93) for reading a draft of this.
More and more startups are looking to hire data scientists who can work autonomously to derive valuable insights from data. In principle, this sounds great: engineers and designers build the product, while data scientists crunch the numbers to gain insights. In practice, finding these data scientists and enabling them to be productive are very challenging tasks.
Before diving further, it's useful to note a few trends in data and product development that have emerged over the past decade:
Companies such as Google, Amazon and Netflix have shown that proper storage and analysis of data can lead to tremendous competitive advantages.
It’s now feasible for startups to instrument and collect vast amounts of usage data. Mobile is ubiquitous, and apps are constantly emitting data. Big data infrastructure has matured, which means large-scale data storage and analysis are affordable.
The widely adopted lean startup philosophy has shifted product development to be much more data-driven. Startups now face the challenges of defining Key Performance Indicators (KPI), designing and implementing A/B tests, understanding growth and engagement funnel conversion, building machine learning models, etc.
Because of these trends, startups are eager to develop in-house data science capabilities. Unfortunately many of them have the wrong ideas about how to build such a team. Let me describe three popular misconceptions.
Misconception #1: It's okay to compromise the engineering bar for statistical skills.
For data scientists to work productively and independently, they must be able to navigate the entire technical stack and work effectively with existing systems to extract relevant data. The only exception is if a startup has already built out its data infrastructure. But in reality, very few startups have their infrastructure in place before building a data science team. In these cases, a data science team without strong engineering skills or engineering support will have a hard time doing their job. At best, they will produce a suboptimal solution that will be rewritten by another team for production.
To illustrate this, take the example of building the KPI dashboard at Codecademy. Before visualizing the data in d3, I had to extract and join (a) user data from MongoDB, (b) cohort data from Redis, and (c) pageview data from Google Analytics. The data collection alone would've been near impossible without an engineering background, let alone making the dashboard real-time, modularized and reusable.
Misconception #2: It's okay to compromise the statistics bar for engineering skills.
Proper interpretation of data is not easy, and misinterpreted data can do more damage than data that's not interpreted at all (check out "Statistics done wrong"). Building useful machine learning (ML) models is trickier than most people expect. A popular but misguided view holds that ML problems can be solved either by applying some black box algorithm (a.k.a. magic), or by hiring interns who are PhD students. In practice, hundreds of decisions and tradeoffs are made in solving such problems, and knowing which decisions to make requires a lot of experience. (I’ll expand upon this more in a future post titled “Machine learning done wrong”.) For a given ML problem, there are tens if not hundreds of ways to solve it. Each solution makes different assumptions, and it's not obvious how to identify which assumptions are reasonable and which model should be used. Some would argue: why not just try all the different approaches, cross validate, and see which one works best? In reality, you never have the bandwidth to do so. A strong data scientist might be able to produce a sensible model right off the bat, while a weak one might spend months optimizing the wrong model without knowing where the problem is.
Misconception #3: It's okay to hire data scientists who lack product thinking.
Imagine asking someone who doesn't have a holistic view of the product to optimize your business KPIs. They may prematurely optimize the sign up funnel before making sure the product has reasonable retention, which would lead to more unretained users. Some think data-driven product development is a local optimization. This criticism is only correct when those who drive product development with data fail to think about the product holistically.
To sum up, a productive data science team requires data scientists that are strong in engineering, statistics, and product thinking. It's hard. And it becomes even harder to look for the first data science hire who will be spearheading data efforts in a startup. For startups that don't have the luxury to wait and hire these rare data scientists, it's important to be aware of the compromises made especially in terms of the hiring bar. Before the data team is strong enough across all three areas, make sure they have strong support for the skills they lack, and don't expect them to work autonomously.
If you like the post, you can follow me (@chengtao_chu) on Twitter or subscribe to "ML in the Valley". Also, special thanks to Ian Wong (@ihat), Leo Polovets, and Bob Ren (@bobrenjc93) for reading a draft of this.
We’ve been working hard over the past four months trying to reimagine Codecademy and we couldn’t be happier to finally unveil it to the world. We have redefined every component under our brand, from a single button on our dashboard to our email template, business cards, slides and even apparel.
We had been discussing a design refresh for a while, but somehow it always ended up being pushed to the side. Finally, in October last year, after completing a user segmentation project that brought to life the main user archetypes of Codecademy.com, it quickly became apparent that if we wanted to grow and mature as a brand, we required a thorough redesign of our entire product.
Why a redesign?
Reason #1 – Start fresh
First, there was the obvious problem of design incoherence and variation. This happened primarily because we lacked a well-defined color and font palette, a uniform visual language for our badges, a unified layout scheme (page types), and a cohesive strategy for all print materials – business cards, postcards, posters, etc. After two and a half years of multiple nip and tuck design fixes and additions, it was time to clean up the house and start fresh. This meant we could finally create an extensible UI pattern library (used and shared by designers and developers) and optimize our new face across multiple platforms by embracing a responsive design layout.
A random sampling of pages within our old web ecosystem, showcasing some visual design inconsistency.
Reason #2 – Brand maturity
Second was the realization that our young startup look and feel was slowly becoming incompatible with our future goals and aspirations. At a time when we are engaged in several partnerships with schools, companies, and governments across the globe, while also continuing to fulfill the needs of our growing user base, our brand should feel a bit more mature, inviting, professional, and sophisticated.
Codecademy’s quirky and undistinguished old logo was created by one of our co-founders in a few minutes by browsing through various fonts in a word processor. The logo featured the giddy Lobster font, which has become so popular that it is at times compared with Microsoft’s Comic Sans.
Our new look
Back in early November last year, we partnered with our friend Eddie Opara, and his immensely talented team of designers at Pentagram, in order to create a new visual identity that could better reflect the company’s age, ambition, and main attributes.
The first thing we tackled was the logo, as the key centerpiece of our new look. We spent some time talking to our users, colleagues, and our founders Zach and Ryan, to have a solid grasp of Codecademy's perception and future aspirations. After this important research period, we went through several revisions, continuously narrowing down on the mark that best represented our main traits.
While putting the finishing touches to our new logo, we began creating a complementary color, font and iconography palette. It was important to handle all these components simultaneously, so we could delineate a consistent design thread through all of them. Phase 1 gave us the most critical building blocks of our new brand, through our partnership with Pentagram, and marked the beginning of an exclusive in-house development period.
Various early directions for our new logo.
Narrowing down on a few favorite visual marks.
Our final logo with its underlying grid.
Our new graphical language used across the site to indicate different types of content (symbols), actions and controls (icons), and learning achievements (badges).
FF DIN Rounded - Our primary typeface.
Our new color palette.
After defining the main brand pieces with Pentagram (logo, typography, iconography, color), we started applying them internally to our entire web ecosystem by building a comprehensive set of reusable design patterns. For two weeks we built a sizable UI toolkit covering a variety of elements (see below).
Our first attempt at the UI toolkit, encompassing only a small number of elements.
Our growing UI toolkit covering every element, such as header, footer, form fields, button styles, sign up modules, grid, padding, typography, colors, and interactions.
This was the longest, and perhaps most exhausting, of all phases, where we redesigned 70+ webpages in tandem with other collateral material (email templates, slides, apparel, etc). Fortunately, we imposed on ourselves a well-defined timeline, with multiple cycles and milestones, which helped guide us through this large task (see sitemap below).
First, we created a comprehensive sitemap of Codecademy.com and then divided the sitemap into four groups, each representing a 2-week delivery cycle. As we redesigned the various pages in each cycle, our brilliant team of developers built our UI styleguide and constructed many of the pages based on the shared design patterns.
Examples of our redesigned pages, from left to right: Enterprise, Stories, Profile.
Examples of our redesigned pages, from left to right: Blog, About, Jobs.
Examples of our redesigned pages, from left to right: Help Center, UK Curriculum, Hour of Code.
Our final phase was all about making sure we were building the thing right. We implemented and tested our new redesign, while in the process getting feedback from our community. We created a huge amount of redlines for all the new material, started experimenting with some versions live on the site, and listened to dozens of comments from our selected users and moderators.
An example of the various comps created to support the accurate implementation of all our redesigned pages.
Even though we spent a long time rethinking Codecademy, this hefty work is still unfinished. It certainly provides our team and product with a much-needed fresh face, one that we can feel proud of, and most importantly, one that our users can thoroughly enjoy. But this is just the beginning. We would love to hear what you have to say about our redesign and how we can continue improving our product. We have dozens of ideas to continue pushing this brand forward. Please keep coming back for more!
Two years ago, we started building a product that would help teach people the skills they needed to succeed in a digital world. As more than 24 million people took Codecademy courses on our web and iOS platforms, we too learned and grew. Now, we’re excited to show you our latest project — a new Codecademy designed from the ground up, aimed to help you learn skills hands-on, with real projects, and constant feedback. Better yet, the new Codecademy experience helps to connect you with the real skills you’ll need to succeed in today’s workplace.
Learning isn’t just about one exercise or “class,” but instead is a gateway to community, opportunities, ideas, and a better life. We’ve witnessed this through the millions of learners on Codecademy and through the thousands of inspiring teachers who have shared their knowledge with the world with our course creator. We listened to them while building what we think is the best learning experience — for anyone, anywhere — to learn the most important skills of today.
In two years, Codecademy has scaled to become larger than we had ever imagined. Our learners, spread across the globe in every country in the world, have:
- written more than a billion lines of code
- joined more than 24 million others in starting the journey of learning to code
- helped to create more than 100,000 courses using our course creator
- hosted meetups in more than 350 cities
- learned on-the-go through our mobile apps for iPhone and iPad, both of which have been featured in the App Store
- worked with nearly 100 partner organizations like The White House and Twitter to both learn and spread their knowledge even further
Today, we’re proud to show off the results of all of that to a few friends and, within days, the rest of the world. The first fruits of this effort are an experience that gets you from knowing nothing to building a website: in this case, Airbnb’s homepage. Along the way, you’ll experiment with blocks of code, see the results of adding and subtracting different parts of a page, and use the real terminology that developers and designers all over the world use to create websites just like Airbnb’s.
Our new platform leaves you not just with new knowledge, but with a portfolio of projects you can share with your friends, enabling them to learn from you. We’ve even built the capability for you to share your work with future employers, and to demonstrate your new skills. We’ve been testing our new learning interface for weeks and we’ve seen it applied in an amazing number of ways — from designers at major firms winning new consulting work because of their ability to build their designs to students in high school making personal webpages for themselves.
Codecademy’s learning experience comes not just from the data behind 24 million learners and billions of lines of code, but also from the individual stories we’ve heard from our wealth of committed learners. Former book critic Juliet Waters, for instance, started learning with her 11-year-old son as part of our Code Year program in 2012. Since then, she’s gone on to chronicle her journey in a book that’s coming soon, noting that programming helped her feel “more connected with others in our tech-driven society.” A parent named Shari told us that her 11- and 13-year-old sons had a “reaction to what they are learning [that] beats their enthusiasm for the [video games].” We work hard every day to deliver a similar experience for our users all around the world. With more than 65% of users outside the US, it’s important to us that we’re democratizing access to the fundamental blocks of knowledge that can improve people’s lives.
Tommy Nicholas’ story is just the sort we’re hoping our new learning environment will foster: he began with almost no programming knowledge at all, and gained enough skills to develop a website, Coffitivity, that was named one of TIME Magazine’s Top 50 websites in 2013.
Billions of lines of code, millions of users, and years after our founding, we’ve been astonished by what people can do when they can easily learn the fundamental skills that can transform their lives. Today, we’ve redesigned Codecademy to reflect that potential — and hopefully to help more people reach their goals and build the future they want to live in.
If you want to help us, we're hiring!
Codecademy started to help anyone learn the skills they need in order to succeed in the twenty-first century. We want Codecademy to be with you everywhere - learning shouldn't be confined to a classroom or a desktop computer.
We launched our first iPhone app, Codecademy: Hour of Code, for anyone in the world to get started learning to code on the go. We've built an entirely new Codecademy experience for mobile that includes the same things that make Codecademy on the web great - interactivity, "snack" sized content, and fun lessons. Our first app gives you the basics of programming and should help absolutely anyone get started with programming - it's almost too easy not to try!
We'll send content updates to this app with more courses for you to complete as time goes on. You'll see more feature updates as well.
Perhaps best of all, this is just the beginning of Codecademy on the go. We want to help you learn the skills that can help you change your life - anywhere and anytime. Download our first app and let us know what you think!
Teacher training is critically important and Codecademy is pleased to partner with CAS to be a part of the solution. Teachers can use Codecademy resources after they've attended training sessions to continue building their skills, and remote teachers can access the platform if they are unable to attend in-person training sessions. Below is the official announcement:
From September 2014 schools in England will teach a new statutory computing curriculum, which aims to ensure all students can understand and apply the fundamental principles and concepts of computer science. This will make England the educational envy of almost every other country in the world, but it will also be a major step change from what schools currently teach. Not surprisingly this has left many teachers looking for support and further training.
CAS is running a national Network of Excellence for Teaching Computer Science, that aims to provide exactly this support and training. Codecademy, based in New York, will complement CAS’s in-person approach with a free online platform and interactive learning resources specifically designed to support the programming aspects of the new computing curriculum in England. Teachers can use it to learn programming themselves, or as a way to teach programming to their students.
Simon Peyton-Jones, Chair of CAS, says “The UK has tens of thousands of teachers who need support and encouragement to deliver the new Computing curriculum with confidence and enthusiasm. Codecademy offers us the scalability of an online platform, and teachers can move smoothly from learning programming themselves to using Codecademy to teach their students. I’m delighted to have this support.”
It's that time of the year again. The air is crisper, lattes are spiced with pumpkin-flavor extract, and Codecademy is coming to a campus near you!
Codecademy is taking to the road for a whirlwind college tour up and down the eastern seaboard. We've had a great time sharing stories from our college fellowship program and spilling the technical details on how we evaluate up to 5,000 code submissions a second (over 25 million a day!).
So far, we've had a turnout at Brown that would scare any fire marshal and we've stayed up all night to hack with the best and brightest at MIT's blow-out HackMIT event. But we're just getting started, so come join us if we're visiting a campus near you!
Hope to see you soon!
MIT tech talk
10/8 at 7pm
Bldg 5, Room 233
Olin tech talk
10/9 at 7pm
Academic Center 126
10/11 at 9pm
MIT On campus interviews
Apply on careerbridge, if interested
A few of us at Codecademy spent the weekend at MIT for HackMIT - a hackathon involving over 1000 college students from around the world. It's been an awesome experience seeing so many coders in one location hacking away at amazing projects.
The atmosphere inspired us to start hacking ourselves. For our hack, we decided to programmatically find the most active Codecademy users at HackMIT and give them some love and swag.
By querying our database for all of the emails of attendees at HackMIT, we were able to find all the Codecademy users in attendance. We then sorted the list of users by points and achievements to find the top 3 most active Codecademy users at HackMIT.
Here is a photo of us with our top HackMIT user - spiltpeasoup!
We were surprised to find out that over 25% of hackers at HackMIT have a Codecademy account! If you are at HackMIT this weekend, stop by and say hi!
As some of you may have realized, Friday morning at about 10:00am, our site was not operable for 2 hours. We apologize for the inconvenience and wanted to explain to you why this happened.
Our hosting provider, Amazon Web Services (AWS), was having networking issues. This affected our app servers, app load balancer, and Redis boxes. Some of you may have noticed a 503 error, which was thrown by our CDN (content delivery network). During these two hours, we were able to restore the site, but because of the networking issues, the site was very slow. At 12:07 Amazon Web Services resolved the issue, and the site was back up and running as normal. Because this was a networking issue, no content or progress was lost.
Again we apologize for any inconvenience caused by this downtime. Unfortunately this particular issue was out of our control. We're investigating ways we can add greater redundancy to Codecademy to help ensure we're protected from similar issues in the future.
See what we posted on Twitter Friday morning.