The official blog of Codecademy Blog
Stay up to date with feature releases, events, and much more.
We’re proud to introduce our new Learn AngularJS course!
Why Learn AngularJS?
We can’t wait to see what you create!
We’re proud to announce the launch of our brand new Ruby on Rails: Authentication course!
In this course, you'll learn how to build an authentication system and an authorization system from scratch. When you’re finished, you'll be able to write your own custom authentication system as well as use third-party systems.
Want more advanced content? You got it—we built Ruby on Rails: Authentication for learners who have finished our Learn Rails course eager to go on to more advanced topics that will help them create their own web projects and further their skills.
Why Authentication and Authorization?
Many web apps let users sign up for a new account as well as log in and out of their accounts. Together, signing up, logging in, and logging out make up an authentication system. Most apps use some form of authentication to ensure that only signed-in users can access content.
In addition to authentication, many web apps have a way to give specific users permission to access certain parts of the site. For example, a blog would give only its authors or admins permission to access the editing and publishing parts of the site. Permissions are defined with an authorization system.
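To make the distinction concrete, here is a minimal sketch in Python (not Rails, and not taken from the course). The in-memory `USERS` store, the role names, and the helper functions are all hypothetical, chosen only to illustrate the two ideas.

```python
import hashlib
import hmac
import os

# Hypothetical in-memory user store: email -> salt, password hash, and role.
USERS = {}


def _hash_password(password, salt):
    # Derive a hash from the password; the plain password is never stored.
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


def sign_up(email, password, role="reader"):
    salt = os.urandom(16)
    USERS[email] = {
        "salt": salt,
        "password_hash": _hash_password(password, salt),
        "role": role,
    }


def log_in(email, password):
    """Authentication: is this user who they claim to be?"""
    user = USERS.get(email)
    if user is None:
        return False
    candidate = _hash_password(password, user["salt"])
    return hmac.compare_digest(user["password_hash"], candidate)


def can_publish(email):
    """Authorization: is this user allowed to use the publishing pages?"""
    user = USERS.get(email)
    return user is not None and user["role"] in {"author", "admin"}
```

A reader who logs in successfully is authenticated, but `can_publish` still returns `False` for them until their role is changed to author or admin.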
Expand your web app by learning authentication and authorization with our latest course. We can’t wait to see your projects!
We’re proud to announce the launch of our new Learn Rails course!
Ruby on Rails is a popular web framework used by companies like Airbnb, GitHub, Groupon, and Codecademy that makes it easy to build dynamic web apps in a short amount of time. You can use Rails to build apps, websites, games, and more—all in an easy to use and popular framework.
When building Learn Rails, we had a few basic concepts in mind. We wanted the course to be accessible, quick to complete, and fun to do. We decided to shorten the Learn Rails course time and created 12 hands-on projects that users can work on throughout the course. Coding is exciting, and the new Learn Rails format makes it easy to grab on to concepts and apply them in useful projects.
Why Learn Rails?
You might notice that Learn Rails is our second Rails course on Codecademy. We listened carefully to the response from students after our first course—Make a Rails App. Students from our in-person Codecademy Labs classes felt they needed more practice before they could create a Rails app on their own. We responded by creating a new course, Learn Rails, to give all of our learners the skills they need to feel confident in coding custom Rails apps.
At Codecademy, we try to learn as much as possible about how learners use our product, and how we can improve. We’re dedicated to our “learn by doing” experience, helping users feel rooted in what they learn, and confident in being able to build their own real-world Rails apps.
We can’t wait to see what you create with Learn Rails!
-Bana, Content at Codecademy
In his 2015 State of the Union Address, the President singled out technical education as a cornerstone of the effort to skill and re-skill Americans for the jobs of today. The President’s focus on equipping Americans with the tech skills necessary to keep the country competitive for the jobs of the future puts Codecademy one step closer to our mission of training and employing previously-unskilled workers in American technology jobs increasingly available across all industries.
Over the last three years, we’ve worked to build the easiest way to learn to code online. It’s been our belief since the beginning that any effort to keep Americans competitive and fully employed in the labor market of the 21st century would need to center around re-education, and that effective coding education could be built to reach anyone with an internet connection.
In November, we partnered with the leading non-traditional providers of coding education across the country to launch ReskillUSA, a partnership designed to educate Americans on the range of options that exist to teach absolute beginners the skills they need to find meaningful work.
Codecademy, alongside the educators involved in ReskillUSA, is thrilled that the President has come out in support of re-educating Americans to meet the demands of the modern labor market. We hope that his announcement helps focus debate about the American skills gap on how to re-educate our technical workers, and we are grateful for the White House’s support as we continue to equip Americans with the skills for tomorrow’s technical jobs.
So far this year, we have helped bring coding to classrooms in over 1,000 schools across England, and now we want to expand our reach and bring the skills required for 21st century jobs to universities all over the country.
What does being a Brand Ambassador mean?
We are looking for passionate and innovative university students to represent Codecademy on their university campus. A brand ambassador’s main goal will be to raise awareness of Codecademy as a free resource that can improve an individual's chances of getting a job. We want our brand ambassadors to hold Codecademy events and workshops, create communities of users who can support each other, and provide outreach to local schools to continue our support for teachers delivering the new computing curriculum. We want you to dedicate at least 5 hours per week to helping create these communities, and to hold the position for a full academic year.
What is in it for you?
You will be the face of Codecademy on campus! We believe brand ambassadors will gain invaluable experience in event management, marketing, coaching, and training. You will also gain insight into a fast-growing startup as we expand internationally, with the opportunity to help shape Codecademy’s strategy and vision. This role will also offer great opportunities to network in the local startup scene, summer internship opportunities, and of course you will receive some awesome Codecademy merchandise!
How do I apply?
We accept single and group applications. All we require is a familiarity with Codecademy and that all applicants are currently students at a UK University. If you are interested, please send your CV and answers to the questions below with the subject line "Codecademy Brand Ambassador" to Rachel Swidenbank at email@example.com. The deadline for applicants is 8/10/2014, but ongoing applications will be reviewed.
- Why is coding so important to your generation?
- How could you create a Codecademy community in your University?
- Why do you think Codecademy should support schools?
Starting today, we’re introducing a new, redesigned Dashboard that unifies the old dashboard with the course catalog. We’ve done this for two main reasons: first, to make it easier to browse the content on Codecademy by putting all the skills you can learn on a single page, and second, to make it easier to gauge your overall progress on the site. Now, on the new Dashboard, you’ll be able to see which skills you’ve started learning, along with your progress in them, and which ones you’ve completed.
You’ll also notice that we’ve opened up some room in the header. Don’t worry, your points are still alive and, as before, you can look at how you’re doing by accessing your Profile. Clicking the Codecademy logo will lead you to the new, unified Dashboard page. As for the Teach page link, you’ll now find it in the footer.
The final change you’ll notice is on the Profile. Now, your Profile will show only the skills you’ve completed. This is also where you’ll find Codebits.
We’re excited about these changes and we’d love to hear what you think. We’re committed to constantly improving your learning experience, so stay tuned for more updates!
I’m proud to announce that Codecademy is partnering with DonorsChoose.org and Google in an effort to double the number of high school girls studying Computer Science. Google.org has committed $1 million to fund $125 DonorsChoose rewards for girls who complete a special Codecademy course. Meanwhile, teachers can earn an additional $500 in classroom rewards when four of their students make it through the course.
Education in critical skills should be accessible to everyone, regardless of their gender, where they live, their income, or any other factor. When we started Codecademy, there was a gap between the skills needed to find a job and the education available to students across the world. Programming, in particular, showed a massive achievement gap, with females representing only 12% of Computer Science degree graduates. We think fixing this problem requires efforts on many fronts: increasing the availability of education, providing role models for future students, and incentivizing students and their teachers to take the first step towards learning CS. We hope that today’s announcement will bolster the number of students who are exposed to Computer Science and choose to study it later in their academic careers.
Today’s partnership has another benefit beyond introducing more women to Computer Science. By working with DonorsChoose.org, we’re ensuring that the $1m in rewards that are disbursed help to improve access to technology and classroom materials. Every time a student completes a Codecademy course, they are helping to purchase new materials, like tablet PCs, textbooks, and more, for their classrooms. Not only are we better preparing our students, but we’re better preparing our classrooms as well.
At Codecademy, we spend every day focusing on how to increase access to the skills that everyone needs to find a job in the twenty-first century. Today’s announcement is one part of the solution, but it’s not the only one. Codecademy, along with DonorsChoose.org, Google, and so many others, will continue to work to make sure a great education is accessible to all.
In 2013, I landed in Shanghai for a few meetings. My first few minutes walking around the city led to a conversation with a stranger I met in a bar after his long day of work. We discussed life in Shanghai, where he lived, how long he’d been in Shanghai, and what he did for a living. He told me he had just landed a new job with a programming consultancy in Shanghai, and said it all started with a website— Codecademy.com — through which he’d learned to code.
This scene repeated itself months later in Dublin, where I was meeting James Whelton of CoderDojo. We talked about programming education and noticed the couple next to us were talking about programming too. It turned out that they were both Dublin-based programmers - he for Facebook and she for a software consultancy. She talked about how she was planning to leave her job writing Apex (for Salesforce) to take a job writing Ruby, which she had just learned on Codecademy.
These stories aren’t unique — in fact, they’re a reality for most Codecademy users, 70% of whom live outside the United States. From the beginning, we’ve watched with amazement as Codecademy spread. The day we launched, we expected traffic to die down overnight in California, but we hadn’t taken into account that people were just signing on in other parts of the world. Since then, we’ve kept our global audience in mind with everything we’ve built, realizing that the power of an education transcends borders beyond the city we build Codecademy in and the language we speak.
Codecademy: Bringing Skills to You, Wherever You Are
Today, we’re bringing easy access to a world-class skills education to even more people across the world — hoping they’ll benefit from Codecademy in the same way that our more than 24 million existing learners have. We’ve worked to translate Codecademy to Spanish, Portuguese, and French, with more languages on the way. But that’s not all — we’re working closely to create communities and become embedded in new countries to help new learners all over the world become empowered with the skills they need to succeed in the 21st century. We’ve got amazing partners to help us bring Codecademy to five new countries (along with those that speak their languages!).
The UK made news as the first G8 country to mandate programming education for all primary and secondary schoolers. We’ve worked hand-in-hand with many organizations in the UK over the past few years — sponsoring Code Club as they bring programming education to after school groups, working with the Computing At School network to help connect teachers with resources, and with the government itself to bring programming to classrooms — and we’re now doubling down on our commitment to the UK by opening our first international office in London, headed by Rachel Swidenbank.
Libraries Without Borders (Bibliothèques sans Frontières) has worked tirelessly over the past few years to expand access to literacy across French-speaking countries, among them Haiti, Cameroon, and others. Today, Codecademy is working with Libraries Without Borders to translate Codecademy into French and to implement pilot programs to reduce unemployment and include programming in schools. In addition, Codecademy will be a component of the recently announced Ideas Box (designed by Philippe Starck), a project that will be deployed in refugee camps and disaster zones across the world to empower individuals with the skills to improve their lives. Grants from the public and private sector in France helped to make all of this possible.
The Lemann Foundation is the largest education foundation in Brazil, funding innovation on the K-12 level and elsewhere by fostering innovation inside the country and by bringing international technological developments to students. Codecademy is available in Portuguese today thanks to close work with the Lemann Foundation and will soon launch in several Brazilian pilots. One of our proudest moments was talking to Brazilian teachers a month ago in São Paulo about today’s launch of Codecademy in Portuguese, their native language.
Argentina and Buenos Aires
The Government of Buenos Aires, led by Mauricio Macri, has made an ambitious commitment to bringing skills and programming education to all of its citizens by working together with Codecademy. Jorge Aguado, the head of educational technology for the City, has worked to make sure that Buenos Aires is one of the first cities in South America (and the world!) to make a statement about its digital future by tying programming into every school in Buenos Aires, pursuing a campaign to provide skills to the unemployed, and training government workers in technology. Both we and the government of Buenos Aires think this is the first commitment of its kind in the South American region and think it’s a terrific template for other cities (and governments) moving forward. Buenos Aires’ commitment is particularly notable given that its Spanish translations will be available to the entirety of the Spanish-speaking world.
Estonia’s Tiger Leap program has helped it become one of the most advanced digital economies in the world. We hope to support this commitment by working with the Estonian government to help every Estonian K-12 student learn to program.
Codecademy Speaks Spanish, Portuguese, French, and more!
It’s often said that code is the “language of the 21st century.” We at Codecademy think that code is a language that’s cross-border and truly international, and that our new work internationally is an essential step towards bringing advanced digital skills to people all over the world. We can’t wait to hear stories from the millions of new Codecademy learners to come and from the additional partners we’ll be announcing soon!
Building education for the world isn’t easy -- technically or from a product perspective. Want to work on projects like this? We’re hiring!
As product development becomes more and more data driven, the demand for essential data analysis tools has surged dramatically. Today, we are excited to announce that we've open sourced EventHub, an event analysis platform that enables startups to run their funnel and cohort analysis on their own servers. Getting EventHub deployed only requires downloading and executing a jar file. To give you a taste of what EventHub can do, we set up a demo server with example funnel and cohort analysis queries.
EventHub was designed to handle hundreds of events per second while running on a single commodity machine. With EventHub, you don’t need to worry about pricey bills. We did this to make it as frictionless as possible for anyone to start doing essential data analysis. While more details can be found in our repository, the following are some key observations, assumptions, and rationales behind the design.
- Basic funnel queries only require two indices: a sorted map from (event_type, event_time) pairs to events, and a sorted map from (user, event_time) pairs to events.
- Basic cohort queries only require one index: a sorted map from (event_type, event_time) pairs to events.
- A/B testing and power analysis are simply statistical calculations based on funnel conversion and pre-determined thresholds.
Therefore, as long as the two indices in the first bullet point fit in memory, all basic analysis (queries by event_type and date range) can be done efficiently. Now, consider a hypothetical scenario in which there are one billion events and one million users. A sorted map implementation like an AVL tree, RB tree, SkipList, etc. can be dismissed, as the overhead of pointers would be prohibitively large. On the other hand, a B+tree may seem to be a reasonable choice. However, since events are ordered and stored chronologically, sorted parallel arrays are a much more space-efficient and simpler implementation. That is, the first index from (event_type, event_time) pairs to events can be implemented as one array storing event_time for each event_type and another parallel array storing event_id, and similarly for the other index. Though separate indices are needed for looking up from event_type or user to their corresponding parallel arrays, event_types and users are orders of magnitude fewer than events, so the space overhead is negligible.
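As a rough illustration of the parallel-array idea, here is a minimal Python sketch. It is not EventHub's actual implementation (EventHub ships as a jar); the class name `ParallelArrayIndex` and its methods are hypothetical, and timestamps are assumed to be plain integers.

```python
from array import array
from bisect import bisect_left, bisect_right
from collections import defaultdict


class ParallelArrayIndex:
    """Maps a key (e.g. an event_type) to two parallel arrays.

    times[i] and ids[i] describe the same event. Because events arrive in
    chronological order, appending keeps both arrays sorted by time.
    """

    def __init__(self):
        # 'q' stores signed 64-bit integers: 8 bytes per entry, no pointers.
        self._times = defaultdict(lambda: array("q"))
        self._ids = defaultdict(lambda: array("q"))

    def append(self, key, event_time, event_id):
        times = self._times[key]
        if times and event_time < times[-1]:
            raise ValueError("events must be appended in chronological order")
        times.append(event_time)
        self._ids[key].append(event_id)

    def ids_in_range(self, key, start_time, end_time):
        """Return event ids for `key` with start_time <= event_time <= end_time."""
        if key not in self._times:
            return []
        times = self._times[key]
        lo = bisect_left(times, start_time)   # first entry >= start_time
        hi = bisect_right(times, end_time)    # one past the last entry <= end_time
        return list(self._ids[key][lo:hi])
```

A range lookup is two binary searches followed by a contiguous slice, which is what makes the layout both compact and cache friendly. The same class could serve as the (user, event_time) index by using the user as the key.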
With parallel array indices, the amount of memory needed is approximately (1B events * (8 bytes for timestamp + 8 bytes for event id)) * 2 = ~32G, which still seems prohibitively large. However, one of the biggest advantages of using parallel arrays is that within each array, the content is homogeneous and thus compression friendly. The compression requirement here is very similar to compressing posting lists in search engines, and with an algorithm like p4delta, the compressed indices can reasonably be expected to be <10G. In addition, EventHub makes another important assumption: that date is the finest granularity for queries. As event ids are assigned in monotonically increasing order, the event id itself can be thought of as a logical timestamp. As EventHub maintains another sorted map from date to the smallest event id on that date, all queries filtered by date range can be translated to queries filtered by event id range. With that assumption, EventHub is able to get rid of the time array and further reduce the index size by half (<5G). Lastly, since the indices are event_ids stored chronologically in arrays, and the arrays are stored as memory-mapped files, the indices are very friendly to the kernel page cache. Also, assuming most analysis only cares about recent events, as long as the tail of the indices fits in memory, most analysis can still be computed without touching disk.
At this point, as the size of the basic indices is small enough, EventHub is able to answer basic funnel and cohort queries efficiently. However, since there are no indices for other properties on events, for queries filtered by event properties other than event_type, EventHub still needs to look up the event properties from disk and filter events accordingly. The space and time complexity of this type of query is not easy to estimate analytically, but in practice, when we ran our internal analysis at Codecademy, the running time for most funnel or cohort queries with event property filters was around a few seconds. To optimize query performance, the following key features are implemented; more details can be found in the repository.
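The date-to-event-id translation can be sketched as follows. This is an illustrative Python sketch under the assumptions stated above (ids assigned in increasing order, date as the finest granularity); the class `DateToEventId` and its methods are hypothetical names, not EventHub's API.

```python
from bisect import bisect_left, bisect_right
from datetime import date


class DateToEventId:
    """Sorted parallel lists: first_ids[i] is the smallest event id on dates[i]."""

    def __init__(self):
        self.dates = []
        self.first_ids = []
        self.next_id = 0  # event ids are assigned in increasing order

    def record(self, day):
        """Assign the next event id to an event that happened on `day`."""
        if self.dates and day < self.dates[-1]:
            raise ValueError("events must arrive in chronological order")
        event_id = self.next_id
        self.next_id += 1
        if not self.dates or day > self.dates[-1]:
            # First event seen on this date: remember its id.
            self.dates.append(day)
            self.first_ids.append(event_id)
        return event_id

    def id_range(self, start_day, end_day):
        """Half-open id range [lo, hi) covering start_day..end_day inclusive."""
        i = bisect_left(self.dates, start_day)   # first date >= start_day
        j = bisect_right(self.dates, end_day)    # first date > end_day
        lo = self.first_ids[i] if i < len(self.dates) else self.next_id
        hi = self.first_ids[j] if j < len(self.dates) else self.next_id
        return lo, hi
```

With this map in place, a query for a date range becomes a query for ids in `[lo, hi)`, so the per-key index only needs to store event ids.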
- Each event has a bloomfilter to quickly reject events whose properties don't exactly match the filter
- An LRU cache for events
Assuming the bloomfilters are in memory, EventHub only needs to do disk lookups for events that actually match the filter criteria (true positives) as well as the false positive events from bloomfilters. As the size of the bloomfilters can be configured, the false positive rate can be adjusted accordingly. Additionally, since most queries only involve recent events, EventHub also keeps an LRU cache for events to optimize query performance. Alternatively, EventHub could have implemented an inverted index, as search engines do, to facilitate fast equality filters. The primary reason for adopting bloomfilters with a cache is that it doesn't require adding more posting lists as new event properties are added, and we believe that for most use cases and with proper compression, EventHub can easily cache hundreds of millions of events in memory and achieve low query latency.
Lastly, EventHub as is doesn't compress the index; we left that as one of our to-dos. In addition, the following two features can easily be added to achieve higher throughput and lower latency if needed.
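The filtering path described above might look roughly like this in Python. It is only a sketch: the toy `BloomFilter`, the `EVENTS_ON_DISK` dictionary standing in for disk storage, and the helper names are all hypothetical, and `functools.lru_cache` stands in for EventHub's own event cache.

```python
import hashlib
from functools import lru_cache


class BloomFilter:
    """Toy bloomfilter: may report false positives, never false negatives."""

    def __init__(self, num_bits=64, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit set

    def _positions(self, item):
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item):
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


# Hypothetical stand-in for events stored on disk: event_id -> properties.
EVENTS_ON_DISK = {
    1: {"browser": "chrome", "plan": "free"},
    2: {"browser": "firefox", "plan": "pro"},
    3: {"browser": "chrome", "plan": "pro"},
}
disk_reads = []  # records every simulated disk read


@lru_cache(maxsize=1024)  # least recently used events are evicted first
def load_event(event_id):
    disk_reads.append(event_id)
    return EVENTS_ON_DISK[event_id]


def build_filters(events):
    """Build one in-memory bloomfilter per event over its key=value pairs."""
    filters = {}
    for event_id, props in events.items():
        bloom = BloomFilter()
        for key, value in props.items():
            bloom.add(f"{key}={value}")
        filters[event_id] = bloom
    return filters


def filter_events(event_ids, filters, key, value):
    matches = []
    for event_id in event_ids:
        if not filters[event_id].might_contain(f"{key}={value}"):
            continue  # definitely not a match: skip the disk read
        if load_event(event_id).get(key) == value:  # rules out false positives
            matches.append(event_id)
    return matches
```

Calling `filter_events([1, 2, 3], build_filters(EVENTS_ON_DISK), "browser", "chrome")` reads only the events the bloomfilters could not reject, and a repeated query is served from the cache without further reads. Raising `num_bits` lowers the false positive rate at the cost of memory.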
- Event properties can be stored in a column-oriented layout, which would allow a high compression rate and great cache locality.
- Events from each user in funnel analysis, cohort analysis, and A/B testing are independent. As a result, horizontal scalability can be trivially achieved by sharding by user.
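To see why sharding by user is straightforward, consider the Python sketch below. It is not part of EventHub; the funnel definition (a user advances one step each time they emit the next step's event_type, with no time window) and all function names are simplifying assumptions made for illustration.

```python
import zlib
from collections import defaultdict


def shard_for(user_id, num_shards):
    """Stable shard assignment (the built-in hash() of a str varies between runs)."""
    return zlib.crc32(user_id.encode("utf-8")) % num_shards


def funnel_counts(events, steps):
    """Count how many users reached each funnel step, in order.

    `events` is an iterable of (user_id, event_type) in chronological order.
    """
    progress = defaultdict(int)  # user_id -> number of steps completed so far
    for user_id, event_type in events:
        done = progress[user_id]
        if done < len(steps) and event_type == steps[done]:
            progress[user_id] = done + 1
    counts = [0] * len(steps)
    for done in progress.values():
        for i in range(done):
            counts[i] += 1
    return counts


def sharded_funnel_counts(events, steps, num_shards):
    shards = [[] for _ in range(num_shards)]
    for user_id, event_type in events:
        shards[shard_for(user_id, num_shards)].append((user_id, event_type))
    # Each shard could live on a separate machine; here they are just lists.
    per_shard = [funnel_counts(shard, steps) for shard in shards]
    # A user's events all land in one shard, so per-shard counts simply add up.
    return [sum(column) for column in zip(*per_shard)]
```

Because no user's events are split across shards, the combined result equals the single-machine result regardless of the number of shards.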
As always, it's open sourced, and pull requests are highly welcome.
If you like the post, you can follow me (@chengtao_chu) on Twitter or subscribe to my blog "ML in the Valley". Also, special thanks to Bob Ren (@bobrenjc93) for reading a draft of this.
More and more startups are looking to hire data scientists who can work autonomously to derive valuable insights from data. In principle, this sounds great: engineers and designers build the product, while data scientists crunch the numbers to gain insights. In practice, finding these data scientists and enabling them to be productive are very challenging tasks.
Before diving further, it's useful to note a few trends in data and product development that have emerged over the past decade:
Companies such as Google, Amazon and Netflix have shown that proper storage and analysis of data can lead to tremendous competitive advantages.
It’s now feasible for startups to instrument and collect vast amounts of usage data. Mobile is ubiquitous, and apps are constantly emitting data. Big data infrastructure has matured, which means large-scale data storage and analysis are affordable.
The widely adopted lean startup philosophy has shifted product development to be much more data-driven. Startups now face the challenges of defining Key Performance Indicators (KPI), designing and implementing A/B tests, understanding growth and engagement funnel conversion, building machine learning models, etc.
Because of these trends, startups are eager to develop in-house data science capabilities. Unfortunately many of them have the wrong ideas about how to build such a team. Let me describe three popular misconceptions.
Misconception #1: It's okay to compromise the engineering bar for statistical skills.
For data scientists to work productively and independently, they must be able to navigate the entire technical stack and work effectively with existing systems to extract relevant data. The only exception is if a startup has already built out its data infrastructure. But in reality, very few startups have their infrastructure in place before building a data science team. In these cases, a data science team without strong engineering skills or engineering support will have a hard time doing its job. At best, it will produce a suboptimal solution that will be rewritten by another team for production.
To illustrate this, take the example of building the KPI dashboard at Codecademy. Before visualizing the data in d3, I had to extract and join (a) user data from MongoDB, (b) cohort data from Redis, and (c) pageview data from Google Analytics. The data collection alone would've been near impossible without an engineering background, let alone making the dashboard real-time, modularized and reusable.
Misconception #2: It's okay to compromise the statistics bar for engineering skills.
Proper interpretation of data is not easy, and misinterpreted data can do more damage than data that's not interpreted at all (check out "Statistics done wrong"). Building useful machine learning (ML) models is trickier than most people expect. A popular but misguided view holds that ML problems can be solved either by applying some black box algorithm (a.k.a. magic), or by hiring interns who are PhD students. In practice, hundreds of decisions and tradeoffs are made in solving such problems, and knowing which decisions to make requires a lot of experience. (I’ll expand upon this more in a future post titled “Machine learning done wrong”.) For a given ML problem, there are tens if not hundreds of ways to solve it. Each solution makes different assumptions, and it's not obvious how to identify which assumptions are reasonable and which model should be used. Some would argue: why not just try all the different approaches, cross validate, and see which one works best? In reality, you never have the bandwidth to do so. A strong data scientist might be able to produce a sensible model right off the bat, while a weak one might spend months optimizing the wrong model without knowing where the problem is.
Misconception #3: It's okay to hire data scientists who lack product thinking.
Imagine asking someone who doesn't have a holistic view of the product to optimize your business KPIs. They may prematurely optimize the sign up funnel before making sure the product has reasonable retention, which would lead to more unretained users. Some think data-driven product development is a local optimization. This criticism is only correct when those who drive product development with data fail to think about the product holistically.
To sum up, a productive data science team requires data scientists that are strong in engineering, statistics, and product thinking. It's hard. And it becomes even harder to look for the first data science hire who will be spearheading data efforts in a startup. For startups that don't have the luxury to wait and hire these rare data scientists, it's important to be aware of the compromises made especially in terms of the hiring bar. Before the data team is strong enough across all three areas, make sure they have strong support for the skills they lack, and don't expect them to work autonomously.
If you like the post, you can follow me (@chengtao_chu) on Twitter or subscribe to "ML in the Valley". Also, special thanks to Ian Wong (@ihat), Leo Polovets, and Bob Ren (@bobrenjc93) for reading a draft of this.