Capstone Project & Debugging Exercises
📖 Concept
The capstone ties everything together. Implement a real-world system that uses MULTIPLE data structures and algorithms working together.
CAPSTONE PROJECT: Build a Mini Search Engine
Requirements:
- Build an inverted index from a set of documents (text files or strings)
- Support multi-word search with AND/OR operations
- Rank results by TF-IDF relevance score
- Provide autocomplete suggestions as user types
- Cache recent searches with LRU cache
- Handle edge cases: empty queries, special characters, case insensitivity
Data structures used:
- Hash Map: inverted index (word → document list)
- Trie: autocomplete suggestions
- LRU Cache: recent search results
- Min-Heap: top-K results by relevance
- Array: document storage
- Set: deduplication
Debugging Exercises (test your understanding): These exercises contain INTENTIONAL bugs. Find and fix them.
- Off-by-one binary search — infinite loop on certain inputs
- Stack overflow in tree traversal — missing base case
- LRU cache memory leak — not deleting from linked list
- Incorrect topological sort — wrong cycle detection
- Hash collision causing wrong answers — using object as key
🏠 Real-world analogy: The capstone is like a final exam where you use EVERYTHING you've learned — not one concept in isolation, but combining data structures and algorithms to build a working system.
💻 Code Example
1// CAPSTONE: MINI SEARCH ENGINE2class MiniSearchEngine {3 constructor() {4 this.documents = new Map(); // docId → text5 this.invertedIndex = new Map(); // word → Set<docId>6 this.autocompleteTrie = { children: {} };7 this.searchCache = new LRUSearchCache(100);8 this.docCount = 0;9 }1011 // Add a document to the engine12 addDocument(text) {13 const docId = this.docCount++;14 this.documents.set(docId, text);1516 const words = this.tokenize(text);17 const wordSet = new Set(words);1819 for (const word of wordSet) {20 if (!this.invertedIndex.has(word)) this.invertedIndex.set(word, new Set());21 this.invertedIndex.get(word).add(docId);22 this.insertTrie(word);23 }24 }2526 // Search for documents matching query27 search(query) {28 // Check cache first29 const cached = this.searchCache.get(query);30 if (cached) return cached;3132 const words = this.tokenize(query);33 if (words.length === 0) return [];3435 // AND search — intersect document sets36 let resultDocs = null;37 for (const word of words) {38 const docs = this.invertedIndex.get(word) || new Set();39 resultDocs = resultDocs40 ? new Set([...resultDocs].filter(d => docs.has(d)))41 : new Set(docs);42 }4344 // Rank by TF-IDF45 const ranked = this.rankResults([...resultDocs], words);4647 // Cache result48 this.searchCache.put(query, ranked);49 return ranked;50 }5152 // Autocomplete suggestions53 suggest(prefix) {54 let node = this.autocompleteTrie;55 for (const c of prefix.toLowerCase()) {56 if (!node.children[c]) return [];57 node = node.children[c];58 }59 const results = [];60 this.dfs(node, prefix.toLowerCase(), results);61 return results.slice(0, 5);62 }6364 // Internal helpers65 tokenize(text) {66 return text.toLowerCase().replace(/[^a-z0-9 ]/g, '').split(/\s+/).filter(w => w.length > 0);67 }6869 insertTrie(word) {70 let node = this.autocompleteTrie;71 for (const c of word) {72 if (!node.children[c]) node.children[c] = { children: {}, count: 0 };73 node = node.children[c];74 }75 node.isEnd = true;76 node.count = (node.count || 0) + 1;77 }7879 dfs(node, path, results) {80 if (node.isEnd) results.push(path);81 for (const [c, child] of Object.entries(node.children)) {82 if (c !== 'isEnd' && c !== 'count') this.dfs(child, path + c, results);83 }84 }8586 rankResults(docIds, queryWords) {87 return docIds.map(docId => ({88 docId,89 score: this.computeTFIDF(docId, queryWords),90 preview: this.documents.get(docId).substring(0, 100)91 })).sort((a, b) => b.score - a.score);92 }9394 computeTFIDF(docId, queryWords) {95 const text = this.documents.get(docId);96 const words = this.tokenize(text);97 let score = 0;98 for (const qw of queryWords) {99 const tf = words.filter(w => w === qw).length / words.length;100 const df = (this.invertedIndex.get(qw)?.size || 0);101 const idf = Math.log((this.docCount + 1) / (df + 1));102 score += tf * idf;103 }104 return score;105 }106}107108class LRUSearchCache {109 constructor(cap) { this.cap = cap; this.map = new Map(); }110 get(key) { if (!this.map.has(key)) return null; const v = this.map.get(key); this.map.delete(key); this.map.set(key, v); return v; }111 put(key, val) { if (this.map.has(key)) this.map.delete(key); this.map.set(key, val); if (this.map.size > this.cap) this.map.delete(this.map.keys().next().value); }112}113114// DEBUGGING EXERCISE #1: Find the bug115function binarySearchBuggy(arr, target) {116 let lo = 0, hi = arr.length; // BUG: should be arr.length - 1117 while (lo <= hi) {118 const mid = Math.floor((lo + hi) / 2);119 if (arr[mid] === target) return mid;120 if (arr[mid] < target) lo = mid + 1;121 else hi = mid; // BUG: should be mid - 1 (causes infinite loop)122 }123 return -1;124}
🏋️ Practice Exercise
Capstone Project Steps:
- Implement the MiniSearchEngine class
- Add 10+ documents (paragraphs of text)
- Test search with single and multi-word queries
- Test autocomplete with various prefixes
- Verify LRU cache is working (search twice, second should be cached)
- Add support for OR search (union of document sets)
- Add stop word filtering (skip 'the', 'a', 'is', etc.)
- Measure performance: how fast is search with 1000 documents?
- Add phrase search: "exact phrase" finds documents with exact sequence
- Write unit tests for each component
Debugging Challenges:
- Fix the binary search bug above — what are the two issues?
- This DFS has a bug:
if (!root) return; visit(root); dfs(root.left);— what's missing? - This LRU cache doesn't evict properly — why?
- This topological sort gives wrong order — the graph has a cycle but returns a result
- This hash map uses objects as keys — why does lookup fail?
⚠️ Common Mistakes
Building the capstone in one giant function — modularize into separate classes
Not testing each component individually before integration
Skipping the debugging exercises — finding bugs is a crucial interview skill
Not measuring performance — understanding real-world speed is important
Over-engineering the capstone — start simple, add features incrementally
💼 Interview Questions
🎤 Mock Interview
Practice a live interview for Capstone Project & Debugging Exercises