Apache Kafka in Five Minutes
				
				
					What is Apache Kafka?
					
						- Framework for building data pipelines and stream-based applications
- Fault tolerant, resilient
- Very high throughput
- Horizontally scalable
- Integrates well with Big Data frameworks like Apache Flink or Apache Spark
- Apache project ⇒ Apache License 2.0 (i.e. open-source software)
Common use cases
					
- Messaging systems (e.g. communication between loosely coupled microservices)
- Gathering metrics from different locations (e.g. IoT)
- Collecting application logs
- Stream processing / transformation
Components
					 
				
				
					Inside the cluster
					 
				
				
					Logs
					
- Each partition (and each of its replicas) is an append-only commit log
- Data in log is immutable
- Each message in log gets unique id (offset)
- Offsets are per partition
- Message order guarantee within partition
- Data is kept only for a configurable retention period (so messages can be replayed until they expire)
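
To make the log properties above concrete, here is a minimal in-memory sketch in Python. It is not Kafka code and uses no Kafka client. The names `Partition`, `Topic`, `send` and `read_from` are hypothetical, and the CRC32-based partitioner is an assumption chosen for simplicity (real Kafka clients use their own hash function). It only illustrates per-partition offsets, ordering within a partition, and replay by re-reading from an earlier offset.

```python
import zlib
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Partition:
    """Toy append-only log: a record's offset is its position in the list."""
    records: List[bytes] = field(default_factory=list)

    def append(self, value: bytes) -> int:
        # Records are only ever added at the end, never changed or reordered.
        self.records.append(value)
        return len(self.records) - 1  # offset assigned to the new record

    def read_from(self, offset: int) -> List[Tuple[int, bytes]]:
        # Reading does not remove anything, so the same offset can be read again.
        return [(o, self.records[o]) for o in range(offset, len(self.records))]


class Topic:
    """Toy topic: a fixed number of independent partitions."""

    def __init__(self, num_partitions: int) -> None:
        self.partitions = [Partition() for _ in range(num_partitions)]

    def send(self, key: bytes, value: bytes) -> Tuple[int, int]:
        # Assumed partitioner: the same key always maps to the same partition,
        # which is what gives per-key ordering in this sketch.
        p = zlib.crc32(key) % len(self.partitions)
        return p, self.partitions[p].append(value)


if __name__ == "__main__":
    topic = Topic(num_partitions=3)
    for i in range(3):
        partition, offset = topic.send(b"sensor-a", f"reading-{i}".encode())
        print(f"sensor-a -> partition {partition}, offset {offset}")

    # Replay: read the same partition again from offset 0.
    for offset, value in topic.partitions[partition].read_from(0):
        print(offset, value.decode())
```

Offsets start at 0 in every partition independently, so an offset is only meaningful together with its partition number. Retention is left out of the sketch: here records are never deleted, whereas Kafka removes old data once the retention limit is reached.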