Sunday 28 July 2019

How much time will it take to create Quora?

If you really want to build a product that scales to 100 million users, it will be a huge undertaking, and I am sure you will face many challenges, both technical and non-technical.
Technical challenges:
  • Advanced machine learning algorithms
  • Scaling
Non-technical challenges:
  • Acquiring customers
  • Keeping the product in line with what customers need
These challenges may sound trivial, but as a tech guy, I can say with confidence that scaling to 100 million people is definitely not a piece of cake. When you have a hundred thousand users, you are all good. When a hundred million users are using your app, things start breaking and you really need some groundbreaking technology to handle this volume of traffic.
However, if you are looking for something simpler, such as a Quora-like question-and-answer website, then that’s quite doable, and I have done it (with the help of a couple of friends). With some effort, anyone can do it.
Here are the screenshots:
The technology behind it:
The database schema is described below:
  • UserProfile table:
      # on_delete is required on OneToOneField from Django 2.0 onwards
      user = models.OneToOneField(User, on_delete=models.CASCADE, primary_key=True)
      avatar = models.ImageField(null=True, upload_to=generate_filename, default="../default_avatar.png")
      bio = models.CharField(max_length=50, null=True)
      followers = models.ManyToManyField(User, related_name='following')
      following = models.ManyToManyField(User, related_name='followers')
  • Topic table:
      name = models.CharField(max_length=50)
      url = models.CharField(max_length=100, primary_key=True)
      followers = models.ManyToManyField(User, related_name='topic_followers')
  • Questions table:
      text = models.CharField(max_length=100)
      time = models.DateTimeField(default=timezone.now)
      asked_by = models.ForeignKey(UserProfile, on_delete=models.SET_NULL,
                                   null=True, db_index=True)
      url = models.CharField(max_length=100, primary_key=True)
      details = models.CharField(max_length=200)
      topics = models.ManyToManyField(Topic, related_name='topic_questions')
      followers = models.ManyToManyField(User, related_name='question_followers')
  • Answers table (yo is the term I used for upvote :P):
      question_url = models.ForeignKey(Question, on_delete=models.CASCADE)
      answered_by = models.ForeignKey(
          UserProfile, db_index=True, on_delete=models.SET_NULL, null=True)
      question_text = models.CharField(max_length=100)
      text = models.TextField()
      time = models.DateTimeField(default=timezone.now)
      yoers = models.ManyToManyField(User, related_name='yoers')
  • Comments table (answers have comments):
      text = models.CharField(max_length=200)
      time = models.DateTimeField(default=timezone.now)
      commented_by = models.ForeignKey(
          User, on_delete=models.SET_NULL, null=True)
      answer = models.ForeignKey(Answer, on_delete=models.CASCADE, db_index=True, related_name='comments')
One interesting piece remains: the newsfeed algorithm.
As can be seen, the newsfeed has 3 parts:
  • Latest Questions/Answers:
      user = request.user
      latest_questions = Question.objects.all().order_by('-time')[:FEED_COUNT]
      latest_answers = Answer.objects.all().order_by(
          '-time')[:FEED_COUNT].prefetch_related('question_url')
      latest_qa = list(latest_questions) + list(latest_answers)
      latest_qa.sort(key=lambda x: x.time, reverse=True)

      yo_list, yo_count_list = utils_get_yo_info(latest_qa, user)
      latest_qa_with_yos = zip(latest_qa, yo_list, yo_count_list)

      return render(request, 'home/latestqa.html', {
          'latest_qa_with_yos': latest_qa_with_yos,
          'user': user,
          'domain': settings.DOMAIN_NAME})
Here is a brief explanation of the code above:
  1. Generate a list of latest 20 or so questions
  2. Generate a list of latest 20 or so answers
  3. Sort them in the reverse order of time after combining the 2 lists
  4. Show them as the latest Q/A newsfeed
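The four steps above can be tried without a database. The following is only a minimal sketch in plain Python: the `Item` class, the `latest_feed` function and the small `FEED_COUNT` value are made up for illustration and are not part of the site's code.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from operator import attrgetter

FEED_COUNT = 3  # assumed small value for the demo; the post says "20 or so"


@dataclass
class Item:
    """Hypothetical stand-in for a Question or Answer row."""
    kind: str  # "question" or "answer"
    text: str
    time: datetime


def latest_feed(questions, answers, feed_count=FEED_COUNT):
    """Take the newest `feed_count` items of each kind, then merge by time."""
    by_time = attrgetter("time")
    latest_questions = sorted(questions, key=by_time, reverse=True)[:feed_count]
    latest_answers = sorted(answers, key=by_time, reverse=True)[:feed_count]
    merged = latest_questions + latest_answers
    merged.sort(key=by_time, reverse=True)  # newest first
    return merged


now = datetime(2019, 7, 28, 12, 0)
# Questions at 0h, 2h, 4h, 6h ago; answers at 1h, 3h, 5h, 7h ago.
questions = [Item("question", f"Q{i}", now - timedelta(hours=2 * i)) for i in range(4)]
answers = [Item("answer", f"A{i}", now - timedelta(hours=2 * i + 1)) for i in range(4)]

feed = latest_feed(questions, answers)
print([item.text for item in feed])  # ['Q0', 'A0', 'Q1', 'A1', 'Q2', 'A2']
```

Note that the merged feed can hold up to twice `FEED_COUNT` items, because each list is cut to `FEED_COUNT` before the two are combined.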
  • Topics you like:
      topics = request.user.topic_followers.all()
      topic_questions = list(set(Question.objects.filter(
          topics__in=topics).order_by('-time')[:FEED_COUNT]))
      topic_questions.sort(key=lambda x: x.time, reverse=True)

      yo_list, yo_count_list = utils_get_yo_info(topic_questions, request.user)
      topic_questions_with_yos = zip(topic_questions, yo_list, yo_count_list)

      return render(request, 'home/topicsyoulike.html',
                    {'topic_questions_with_yos': topic_questions_with_yos})
This is quite straightforward: a simple database query plus some manipulation.
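One of those manipulations deserves a closer look. Filtering across a many-to-many relation can return the same question once per matching topic, which is presumably why the view wraps the result in `set(...)`. Because the slice happens before the de-duplication, the feed can end up shorter than `FEED_COUNT`. The sketch below shows the effect in plain Python; the row data and both function names are invented for the demonstration.

```python
FEED_COUNT = 3  # assumed small value for the demo

# Hypothetical (question_id, time) rows, newest first, as a join on followed
# topics might return them. Question 10 matches two followed topics.
joined_rows = [(10, 5), (10, 5), (9, 4), (8, 3), (7, 2)]


def slice_then_dedupe(rows, feed_count=FEED_COUNT):
    """Mirror the view: cut to feed_count rows, then drop duplicates."""
    unique = set(rows[:feed_count])
    return sorted(unique, key=lambda row: row[1], reverse=True)


def dedupe_then_slice(rows, feed_count=FEED_COUNT):
    """Alternative: drop duplicates first, then cut to feed_count rows."""
    unique = list(dict.fromkeys(rows))  # keeps first-seen (newest-first) order
    return unique[:feed_count]


print(slice_then_dedupe(joined_rows))  # [(10, 5), (9, 4)]
print(dedupe_then_slice(joined_rows))  # [(10, 5), (9, 4), (8, 3)]
```

In Django, one way to get the second behaviour would be to call `.distinct()` on the queryset before slicing it. That is a suggestion, not what the site actually did.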
  • People you follow:
      following = request.user.following.all()
      answers = list(set(Answer.objects.filter(
          answered_by__in=following).order_by('-time')[:FEED_COUNT]))
      answers.sort(key=lambda x: x.time, reverse=True)

      yo_list, yo_count_list = utils_get_yo_info(answers, request.user)
      answers_with_yos = zip(answers, yo_list, yo_count_list)

      return render(request, 'home/peopleyoufollow.html',
                    {'answers_with_yos': answers_with_yos})
Again, this is also quite straightforward to understand.
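All three views call `utils_get_yo_info`, whose body is not shown in this post. Judging only from how its return values are used (two lists zipped alongside the feed items), it probably reports, for each item, whether the current user has yo'd it and how many yos it has. The following is a guess at that shape in plain Python, not the real helper; `FeedItem` and `get_yo_info` are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class FeedItem:
    """Hypothetical stand-in for a Question or Answer in the feed."""
    text: str
    # Usernames that yo'd (upvoted) the item; empty for items without yos.
    yoers: set = field(default_factory=set)


def get_yo_info(items, username):
    """Return two lists parallel to `items`: did `username` yo it, and yo count."""
    yo_list = [username in item.yoers for item in items]
    yo_count_list = [len(item.yoers) for item in items]
    return yo_list, yo_count_list


items = [
    FeedItem("An answer", yoers={"alice", "bob"}),
    FeedItem("A question"),  # in the schema above, only answers carry yoers
    FeedItem("Another answer", yoers={"carol"}),
]
yo_list, yo_count_list = get_yo_info(items, "alice")

# zip() returns a one-shot iterator in Python 3, so wrap it in list()
# if the result has to be iterated more than once.
rows = list(zip(items, yo_list, yo_count_list))
print([(item.text, yoed, count) for item, yoed, count in rows])
```

Keeping the three sequences parallel means the template can loop over a single zipped sequence and read the item, the "already yo'd" flag and the count together.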
All of these newsfeed ‘algorithms’ were written by me and are in a very raw form. None of them is really ‘intelligent’ and I am sure many of the database queries can be optimized significantly. But again, this wasn’t supposed to be Quora, it was rather supposed to be Quora-like.
Note that the site has been abandoned. It was just an academic project (no commercialization) and the questions and answers you see in the screenshots are because it was tested by my friends a couple of days back. It took a few weeks to complete the web-app.
The name SoclWebApp was randomly chosen (Socl == (Soc)ialize + (L)earn). The site isn’t indexed by search engines and you won’t be able to find it on Google/elsewhere. However, I would be more than happy to answer any questions :)

4 comments:

  1. I once built a PHP- and SQL-based Facebook-like site myself. It had most of the functions, such as messaging, login, friend requests, status updates, notifications, media, etc. The inspiration was the movie 'The Social Network'. Its UI looked like the Facebook of 2005. But one thing is sure: these projects require another level of determination and patience.

    1. Can you explain in detail how to proceed, @shrish?

  2. @Anshuman In short: set up a localhost server with WAMP or something similar, then learn SQL and PHP, request handling, encryption, client-side development with JavaScript and jQuery, and forms and validation. That is all this mini version requires. This article itself is a good start, but remember that scaling a product to millions of users is not possible with this. You'll explore many things in this project. If you get stuck anywhere, look it up on the web.
