Phase 1: Digital Logic Fundamentals (2-3 weeks)
Building blocks of computer systems
1. Number Systems & Codes
- Binary, octal, decimal, hexadecimal conversions
- Signed number representations (sign-magnitude, 1's complement, 2's complement)
- Binary arithmetic (addition, subtraction, multiplication, division)
- Fixed-point and floating-point representations (IEEE 754)
- Error detection and correction codes (parity, Hamming code, CRC)
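As an illustration of the signed representations listed above, here is a minimal sketch of two's complement encoding and decoding. The function names and the default 8-bit width are assumptions chosen for the example, not part of any standard library.

```python
def to_twos_complement(value: int, bits: int = 8) -> str:
    """Return the `bits`-wide two's complement bit string for `value`."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if not lo <= value <= hi:
        raise ValueError(f"{value} does not fit in {bits} bits")
    # Masking maps a negative value onto its unsigned bit pattern.
    return format(value & ((1 << bits) - 1), f"0{bits}b")


def from_twos_complement(pattern: str) -> int:
    """Decode a two's complement bit string back to a signed integer."""
    raw = int(pattern, 2)
    # A set most-significant bit carries the weight -2^(bits-1).
    if pattern[0] == "1":
        raw -= 1 << len(pattern)
    return raw


print(to_twos_complement(-5))            # 11111011
print(from_twos_complement("11111011"))  # -5
```

The range check makes overflow explicit: an 8-bit pattern can hold only -128 through 127.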
2. Boolean Algebra & Logic Gates
- Boolean operations and De Morgan's laws
- Canonical forms (SOP, POS, minterms, maxterms)
- Karnaugh maps and logic minimization
- Basic gates (AND, OR, NOT, NAND, NOR, XOR, XNOR)
- Universal gates and gate-level circuit design
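To make the "universal gate" idea concrete, the sketch below builds NOT, AND, and OR from NAND alone and checks De Morgan's laws over every input combination. The helper names (`nand`, `not_`, `and_`, `or_`, `de_morgan_holds`) are illustrative.

```python
from itertools import product


def nand(a: int, b: int) -> int:
    """Two-input NAND on single bits (0 or 1)."""
    return 1 - (a & b)


def not_(a: int) -> int:
    return nand(a, a)


def and_(a: int, b: int) -> int:
    return not_(nand(a, b))


def or_(a: int, b: int) -> int:
    # De Morgan: a OR b == NOT(NOT a AND NOT b) == NAND(NOT a, NOT b)
    return nand(not_(a), not_(b))


def de_morgan_holds() -> bool:
    """Check both De Morgan identities exhaustively for two inputs."""
    return all(
        (1 - (a & b)) == ((1 - a) | (1 - b))
        and (1 - (a | b)) == ((1 - a) & (1 - b))
        for a, b in product((0, 1), repeat=2)
    )


print(de_morgan_holds())  # True
```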
3. Combinational Circuits
- Multiplexers and demultiplexers
- Encoders and decoders
- Comparators
- Adders (half, full, ripple-carry, carry-lookahead)
- Subtractors and ALU design fundamentals
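The adder chain above can be sketched in software: a full adder computes one sum bit and a carry, and a ripple-carry adder passes the carry from each bit position to the next. The names `full_adder` and `ripple_carry_add` and the 4-bit default are assumptions for this example.

```python
def full_adder(a: int, b: int, carry_in: int) -> tuple[int, int]:
    """Return (sum_bit, carry_out) for three single-bit inputs."""
    sum_bit = a ^ b ^ carry_in
    carry_out = (a & b) | (carry_in & (a ^ b))
    return sum_bit, carry_out


def ripple_carry_add(x: int, y: int, bits: int = 4) -> tuple[int, int]:
    """Add two unsigned `bits`-wide values; return (result, final_carry)."""
    carry = 0
    result = 0
    for i in range(bits):
        # Each stage waits for the carry produced by the previous stage.
        sum_bit, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= sum_bit << i
    return result, carry


print(ripple_carry_add(5, 3))  # (8, 0)
print(ripple_carry_add(9, 8))  # (1, 1): 17 does not fit in 4 bits
```

The serial carry dependency in the loop is the delay that a carry-lookahead adder is designed to shorten.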
4. Sequential Circuits
- Latches (SR, D, JK, T)
- Flip-flops and timing analysis
- Registers (shift, parallel load, bidirectional)
- Counters (synchronous, asynchronous, up/down, modulo-N)
- State machines (Moore and Mealy models)
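A Moore machine ties its output to the current state only. The sketch below models one as a transition table: a hypothetical detector for the bit sequence 101 (overlapping matches allowed). State names and the function name `detect_101` are illustrative.

```python
# Moore machine: next state depends on (state, input); output depends on state.
TRANSITIONS = {
    "S0": {"0": "S0", "1": "S1"},  # nothing matched yet
    "S1": {"0": "S2", "1": "S1"},  # seen "1"
    "S2": {"0": "S0", "1": "S3"},  # seen "10"
    "S3": {"0": "S2", "1": "S1"},  # seen "101"; its final 1 may start a new match
}
OUTPUT = {"S0": 0, "S1": 0, "S2": 0, "S3": 1}


def detect_101(bits: str) -> list[int]:
    """Return the machine's output after each input bit."""
    state = "S0"
    outputs = []
    for bit in bits:
        state = TRANSITIONS[state][bit]
        outputs.append(OUTPUT[state])
    return outputs


print(detect_101("10101"))  # [0, 0, 1, 0, 1]
```

A Mealy version would attach the output to the transitions instead, which typically needs one fewer state for the same detector.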
Phase 2: Computer Architecture Basics (3-4 weeks)
Understanding processor organization
1. Von Neumann & Harvard Architectures
- Stored program concept
- Architectural differences and trade-offs
- Modified Harvard architecture
2. CPU Components
- Control Unit (hardwired vs. microprogrammed)
- Arithmetic Logic Unit (ALU)
- Registers (general-purpose, special-purpose)
- Program Counter (PC) and Instruction Register (IR)
- Status/Flag registers
3. Instruction Set Architecture (ISA)
- CISC vs. RISC philosophies
- Instruction formats (R-type, I-type, S-type)
- Addressing modes (immediate, direct, indirect, register, indexed)
- Instruction types (data transfer, arithmetic, logical, control flow)
- Assembly language basics
4. Data Path Design
- Single-cycle datapath
- Multi-cycle datapath
- Microprogramming
- Control signal generation
Phase 3: Memory Systems (2-3 weeks)
Memory hierarchy and management
1. Memory Hierarchy
- Registers, cache, main memory, secondary storage
- Locality principles (temporal and spatial)
- Memory access times and performance metrics
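One common performance metric for a memory hierarchy is average memory access time (AMAT): hit time plus miss rate times miss penalty. A minimal sketch for a single cache level follows; the function name `amat` and the sample numbers are assumptions for illustration.

```python
def amat(hit_time: float, miss_rate: float, miss_penalty: float) -> float:
    """Average memory access time for one cache level, in the unit of the inputs."""
    if not 0.0 <= miss_rate <= 1.0:
        raise ValueError("miss_rate must be between 0 and 1")
    return hit_time + miss_rate * miss_penalty


# Hypothetical figures: 1-cycle hit, 5% miss rate, 100-cycle miss penalty.
print(amat(hit_time=1, miss_rate=0.05, miss_penalty=100))
```

For a multi-level hierarchy, the miss penalty of one level can itself be computed as the AMAT of the next level down.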
2. Cache Memory
- Cache organization (direct-mapped, set-associative, fully-associative)
- Cache mapping functions
- Replacement policies (LRU, FIFO, Random, LFU)
- Write policies (write-through, write-back)
- Cache coherence protocols (MESI, MOESI)
- Multi-level cache hierarchies (L1, L2, L3)
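The organizations and replacement policies above can be explored with a small model. This is a sketch, assuming a set-associative cache with LRU replacement that tracks tags only (no data, no write policy); the class name `SetAssociativeCache` and its parameters are hypothetical. Setting `ways=1` gives a direct-mapped cache, and `num_sets=1` gives a fully-associative one.

```python
from collections import OrderedDict


class SetAssociativeCache:
    """Tag-only cache model with LRU replacement within each set."""

    def __init__(self, num_sets: int, ways: int, block_size: int):
        self.num_sets = num_sets
        self.ways = ways
        self.block_size = block_size
        # One OrderedDict per set; insertion order tracks recency of use.
        self.sets = [OrderedDict() for _ in range(num_sets)]
        self.hits = 0
        self.misses = 0

    def access(self, address: int) -> bool:
        """Record one access; return True on a hit, False on a miss."""
        block = address // self.block_size   # drop the block offset
        index = block % self.num_sets        # selects the set
        tag = block // self.num_sets         # identifies the block within the set
        lines = self.sets[index]
        if tag in lines:
            lines.move_to_end(tag)           # mark as most recently used
            self.hits += 1
            return True
        self.misses += 1
        if len(lines) >= self.ways:
            lines.popitem(last=False)        # evict the least recently used tag
        lines[tag] = True
        return False


direct = SetAssociativeCache(num_sets=4, ways=1, block_size=16)
print([direct.access(a) for a in (0, 64, 0)])    # [False, False, False]
two_way = SetAssociativeCache(num_sets=2, ways=2, block_size=16)
print([two_way.access(a) for a in (0, 64, 0)])   # [False, False, True]
```

Both caches hold four blocks, but addresses 0 and 64 conflict in the direct-mapped one, while the two-way cache keeps both resident.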
3. Virtual Memory
- Paging and page tables
- Translation Lookaside Buffer (TLB)
- Segmentation
- Page replacement algorithms (FIFO, LRU, Optimal, Clock)
- Demand paging and thrashing
- Memory Management Unit (MMU)
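The interaction between paging and the TLB can be sketched as follows, assuming a single-level page table, 4 KiB pages, and a TLB modelled as an unbounded dictionary. `PAGE_SIZE`, `translate`, and the table contents are hypothetical; a real MMU also checks permissions and evicts TLB entries.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


def translate(vaddr: int, page_table: dict[int, int], tlb: dict[int, int]) -> tuple[int, str]:
    """Translate a virtual address; return (physical_address, 'tlb_hit' or 'tlb_miss')."""
    vpn, offset = divmod(vaddr, PAGE_SIZE)   # virtual page number and page offset
    if vpn in tlb:
        return tlb[vpn] * PAGE_SIZE + offset, "tlb_hit"
    if vpn not in page_table:
        # In a real system the OS would handle this by loading the page.
        raise LookupError(f"page fault on virtual page {vpn}")
    frame = page_table[vpn]
    tlb[vpn] = frame                         # cache the translation
    return frame * PAGE_SIZE + offset, "tlb_miss"


page_table = {0: 5, 1: 2}
tlb = {}
print(translate(4100, page_table, tlb))  # (8196, 'tlb_miss')
print(translate(4100, page_table, tlb))  # (8196, 'tlb_hit')
```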
4. Main Memory Technologies
- SRAM vs. DRAM
- SDRAM, DDR, DDR2, DDR3, DDR4, DDR5
- Memory interleaving
- ECC memory
Phase 4: Pipelining (2-3 weeks)
Instruction pipeline fundamentals
1. Pipeline Fundamentals
- Instruction pipeline stages (IF, ID, EX, MEM, WB)
- Pipeline throughput and speedup
- Pipeline latency
- CPI in pipelined systems
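Throughput and speedup can be estimated with a simple model. This sketch assumes an ideal pipeline: equal-length stages, no stalls, and an unpipelined baseline that spends one cycle per stage on every instruction. The function names are illustrative.

```python
def pipeline_cycles(instructions: int, stages: int) -> int:
    """Cycles to finish `instructions` on an ideal `stages`-deep pipeline."""
    # The first instruction takes `stages` cycles; each later one completes one cycle after.
    return stages + instructions - 1


def ideal_speedup(instructions: int, stages: int) -> float:
    """Speedup over an unpipelined design taking `stages` cycles per instruction."""
    return (instructions * stages) / pipeline_cycles(instructions, stages)


print(pipeline_cycles(100, 5))  # 104
print(ideal_speedup(100, 5))
```

As the instruction count grows, the speedup approaches the number of stages; hazards and stalls keep real pipelines below that bound.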
2. Pipeline Hazards
- Structural hazards: resource conflicts
- Data hazards: RAW, WAR, WAW dependencies
- Control hazards: branches and other changes in control flow
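The three data dependency types can be identified by comparing the registers two instructions read and write. In this sketch each instruction is reduced to a hypothetical `(dest, sources)` pair, and `classify_hazards` is an illustrative name; whether a dependency actually stalls a given pipeline depends on its design.

```python
def classify_hazards(first, second) -> set[str]:
    """Return the dependency types from `first` to a later instruction `second`.

    Each instruction is a (dest, sources) pair, e.g. ("r1", ("r2", "r3")).
    """
    first_dest, first_srcs = first
    second_dest, second_srcs = second
    hazards = set()
    if first_dest is not None and first_dest in second_srcs:
        hazards.add("RAW")  # second reads what first writes
    if second_dest is not None and second_dest in first_srcs:
        hazards.add("WAR")  # second overwrites what first still reads
    if first_dest is not None and first_dest == second_dest:
        hazards.add("WAW")  # both write the same register
    return hazards


# add r1, r2, r3 followed by sub r4, r1, r5
print(classify_hazards(("r1", ("r2", "r3")), ("r4", ("r1", "r5"))))  # {'RAW'}
```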
3. Hazard Resolution
- Forwarding (bypassing)
- Stalling (pipeline bubbles)
- Branch prediction (static and dynamic)
- Branch delay slots
- Speculative execution
4. Advanced Pipelining
- Superpipelining
- Superscalar architectures
- Out-of-order execution
- Register renaming
- Tomasulo's algorithm
- Reorder buffer (ROB)
Phase 5: Instruction-Level Parallelism (2 weeks)
ILP techniques and optimization
1. ILP Techniques
- Loop unrolling
- Software pipelining
- Trace scheduling
- VLIW architectures
- Predication and conditional execution
2. Branch Prediction
- Static prediction schemes
- Dynamic prediction (1-bit, 2-bit saturating counters)
- Branch History Table (BHT)
- Branch Target Buffer (BTB)
- Two-level adaptive predictors
- Tournament predictors
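A 2-bit saturating counter predictor, as listed above, can be sketched in a few lines. This version assumes a small table indexed by the low bits of the branch address, with counters initialised to "weakly not taken"; the class name, table size, and initial state are choices made for the example.

```python
class TwoBitPredictor:
    """Branch history table of 2-bit saturating counters (0-1: not taken, 2-3: taken)."""

    def __init__(self, entries: int = 16):
        self.entries = entries
        self.table = [1] * entries  # assumed initial state: weakly not taken

    def predict(self, pc: int) -> bool:
        return self.table[pc % self.entries] >= 2

    def update(self, pc: int, taken: bool) -> None:
        i = pc % self.entries
        if taken:
            self.table[i] = min(3, self.table[i] + 1)  # saturate at strongly taken
        else:
            self.table[i] = max(0, self.table[i] - 1)  # saturate at strongly not taken


def count_correct(predictor: TwoBitPredictor, pc: int, outcomes: list[bool]) -> int:
    """Run the predictor over a sequence of outcomes for one branch."""
    correct = 0
    for taken in outcomes:
        correct += predictor.predict(pc) == taken
        predictor.update(pc, taken)
    return correct


# A loop branch taken nine times, then not taken on exit.
print(count_correct(TwoBitPredictor(), 0x40, [True] * 9 + [False]))  # 8
```

The two mispredictions are the first iteration (cold counter) and the loop exit. On the next run of the same loop the counter sits at "weakly taken", so only the exit would mispredict, which is the advantage over a 1-bit scheme.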
Phase 6: Parallel Processing (3 weeks)
Multi-core and distributed systems
1. Parallel Architecture Models
- Flynn's taxonomy (SISD, SIMD, MISD, MIMD)
- Shared memory vs. distributed memory
- UMA and NUMA architectures
2. Multicore Processors
- Symmetric Multiprocessing (SMP)
- Chip Multiprocessing (CMP)
- Simultaneous Multithreading (SMT/Hyper-Threading)
- Thread-level parallelism
3. GPU Architecture
- SIMT model
- Streaming multiprocessors
- Warp execution
- Memory hierarchy in GPUs
4. Interconnection Networks
- Bus-based systems
- Crossbar switches
- Multistage networks (Omega, Butterfly)
- Mesh and torus topologies
- Network-on-Chip (NoC)
Phase 7: I/O and Storage Systems (2 weeks)
Input/Output and storage technologies
1. I/O Organization
- Programmed I/O
- Interrupt-driven I/O
- Direct Memory Access (DMA)
- I/O processors and channels
- Memory-mapped I/O vs. port-mapped I/O
2. Storage Technologies
- Magnetic disks (HDD)
- Solid-state drives (SSD)
- RAID levels (0, 1, 5, 6, 10)
- NVMe and PCIe storage
3. I/O Performance
- Disk scheduling algorithms (FCFS, SSTF, SCAN, C-SCAN)
- I/O bottlenecks and optimization
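Disk scheduling algorithms are usually compared by total head movement. The sketch below contrasts FCFS and SSTF on a made-up request queue; the function names and the track numbers are assumptions for the example.

```python
def fcfs_distance(start: int, requests: list[int]) -> int:
    """Total head movement when requests are served in arrival order."""
    total, head = 0, start
    for track in requests:
        total += abs(track - head)
        head = track
    return total


def sstf_distance(start: int, requests: list[int]) -> int:
    """Total head movement when the closest pending request is always served next."""
    pending = list(requests)
    total, head = 0, start
    while pending:
        nearest = min(pending, key=lambda track: abs(track - head))
        total += abs(nearest - head)
        head = nearest
        pending.remove(nearest)
    return total


queue = [98, 183, 37, 122, 14, 124, 65, 67]
print(fcfs_distance(53, queue))  # 640
print(sstf_distance(53, queue))  # 236
```

SSTF reduces seek distance here but can starve far-away requests under a steady load, which is the motivation for SCAN and C-SCAN.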
Phase 8: Advanced Topics (3-4 weeks)
Power, reliability, and emerging technologies
1. Power and Energy Management
- Dynamic voltage and frequency scaling (DVFS)
- Clock gating
- Power gating
- Dark silicon
- Thermal design power (TDP)
2. Fault Tolerance and Reliability
- Redundancy techniques
- Checkpointing
- Error detection and correction at architectural level
3. Security in Computer Architecture
- Side-channel attacks (Spectre, Meltdown)
- Cache timing attacks
- Hardware security modules
- Trusted execution environments (TEE)
4. Quantum Computing Basics
- Qubits and quantum gates
- Quantum vs. classical architecture differences
Major Algorithms, Techniques, and Tools
Essential Computer Architecture Knowledge
1. Arithmetic Algorithms
- Booth's multiplication algorithm
- Restoring and non-restoring division
- Wallace tree multiplier
- Carry-lookahead adder algorithm
- Floating-point arithmetic (IEEE 754)
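Booth's algorithm from the list above can be traced in software using the classic register layout: accumulator A, multiplier Q, and the extra bit Q-1. This is a sketch with A and Q each `bits` wide. Under that layout the most negative multiplicand is rejected, because negating it overflows the accumulator. The function name and the 8-bit default are assumptions.

```python
def booth_multiply(multiplicand: int, multiplier: int, bits: int = 8) -> int:
    """Signed multiply of two `bits`-wide two's complement values via Booth's algorithm."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if not (lo < multiplicand <= hi and lo <= multiplier <= hi):
        raise ValueError("operand out of range for this register width")
    mask = (1 << bits) - 1
    m = multiplicand & mask
    a, q, q_minus1 = 0, multiplier & mask, 0
    for _ in range(bits):
        pair = (q & 1, q_minus1)
        if pair == (1, 0):
            a = (a - m) & mask          # start of a run of 1s: subtract
        elif pair == (0, 1):
            a = (a + m) & mask          # end of a run of 1s: add
        # Arithmetic right shift of the combined register A:Q:Q-1.
        q_minus1 = q & 1
        q = (q >> 1) | ((a & 1) << (bits - 1))
        sign = a >> (bits - 1)
        a = (a >> 1) | (sign << (bits - 1))
    product = (a << bits) | q           # 2*bits-wide two's complement result
    if product >> (2 * bits - 1):
        product -= 1 << (2 * bits)
    return product


print(booth_multiply(7, -3))    # -21
print(booth_multiply(-7, -8, bits=4))  # 56
```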
2. Cache Algorithms
- LRU (Least Recently Used)
- LFU (Least Frequently Used)
- FIFO (First In First Out)
- Random replacement
- Belady's optimal algorithm
3. Page Replacement Algorithms
- FIFO
- LRU and approximations (Clock/Second Chance)
- Working Set algorithm
- Page Fault Frequency (PFF)
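FIFO and LRU page replacement can be compared by counting page faults on a reference string. The sketch below uses illustrative function names and a made-up reference string; it models only which pages are resident.

```python
from collections import OrderedDict, deque


def fifo_faults(refs: list[int], frames: int) -> int:
    """Count page faults when the oldest-loaded page is evicted first."""
    resident, order, faults = set(), deque(), 0
    for page in refs:
        if page in resident:
            continue
        faults += 1
        if len(resident) == frames:
            resident.discard(order.popleft())
        resident.add(page)
        order.append(page)
    return faults


def lru_faults(refs: list[int], frames: int) -> int:
    """Count page faults when the least recently used page is evicted."""
    resident = OrderedDict()  # ordered from least to most recently used
    faults = 0
    for page in refs:
        if page in resident:
            resident.move_to_end(page)
            continue
        faults += 1
        if len(resident) == frames:
            resident.popitem(last=False)
        resident[page] = True
    return faults


refs = [7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1]
print(fifo_faults(refs, 3))  # 15
print(lru_faults(refs, 3))   # 12
```

FIFO can also get worse with more memory (Belady's anomaly): on the string 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5 it takes 9 faults with three frames but 10 with four.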
4. Pipeline Optimization
- Tomasulo's algorithm (dynamic scheduling)
- Scoreboarding
- Register renaming algorithms
5. Design Tools
- Logisim/Digital: Logic circuit simulation
- ModelSim/QuestaSim: HDL simulation
- Vivado/Quartus: FPGA design
- gem5: Full-system simulator
- SimpleScalar: Processor simulator
- CACTI: Cache modeling
Cutting-Edge Developments
Modern Hardware Innovations
1. Advanced Process Technologies
- 3nm and smaller nodes
- Gate-All-Around (GAA) FETs
- 3D chip stacking (TSVs)
- Chiplet architectures (AMD Zen, Intel Ponte Vecchio)
2. Neuromorphic Computing
- Spiking neural networks in hardware
- IBM TrueNorth and Intel Loihi
- Event-driven, brain-inspired architectures
3. Heterogeneous Computing
- CPU-GPU integration (AMD APUs, Apple Silicon)
- Domain-specific accelerators (TPUs)
- FPGA integration
4. RISC-V Ecosystem
- Open-source ISA gaining adoption
- Custom extensions for specific domains
- SiFive, StarFive implementations
5. Security and Reliability
- Post-quantum cryptography accelerators
- Confidential computing (AMD SEV, Intel SGX)
- Hardware-based attestation
- Side-channel attack mitigation
Project Ideas (Beginner to Advanced)
Practical Projects to Apply COA Skills
1. Beginner Level
Project 1: Digital Logic Circuits
- Design a 4-bit ALU using Logisim
- Implement basic operations: ADD, SUB, AND, OR, XOR
- Add overflow detection
Project 2: Simple Calculator
- Calculator with basic arithmetic operations
- Use 7-segment displays for output
- Implement using FPGA or simulator
Project 3: Memory Hierarchy Simulator
- Simulate a simple cache (direct-mapped)
- Implement hit/miss detection
- Calculate hit rate for different access patterns
Project 4: Assembly Programming
- Write programs in MIPS/ARM/RISC-V assembly
- Implement sorting algorithms
- Analyze instruction counts and cycles
2. Intermediate Level
Project 5: Pipelined Processor Simulator
- Simulate a 5-stage RISC pipeline
- Implement data forwarding
- Handle control hazards with branch prediction
Project 6: Cache Simulator
- Implement direct-mapped, set-associative, fully-associative
- Support LRU, FIFO, and Random replacement
- Analyze performance with real traces
Project 7: Branch Predictor Analysis
- Implement prediction schemes (1-bit, 2-bit, two-level)
- Test with benchmark traces
- Compare accuracy and hardware cost
3. Advanced Level
Project 10: Out-of-Order Processor Simulator
- Simulate Tomasulo's algorithm
- Include register renaming and ROB
- Support speculative execution
Project 11: Multicore Cache Coherence
- Simulate multi-core with private L1 caches
- Implement MESI or MOESI protocol
- Test with parallel workloads
4. Research-Level
Project 19: ML-Based Hardware Prefetcher
- Design prefetcher that learns access patterns
- Implement using on-chip learning
- Compare with traditional prefetchers