

© 2017 Dr. Jeffrey A. Turkstra

### Lecture 03

- Basics
- Processors
- Architecture
- ISA
- DMA
- Modes




- Some lecture material based on:
  - Slides by Dr. George B. Adams III
  - Slides from Hennessy & Patterson
  - Slides from Silberschatz


#### **Basics**

- Moore's Law for integrated circuits
  - Transistor count for a typical processor or memory chip increases 40% to 55% per year.
  - Doubles every 18-24 months
  - Transistor count ~= computational power
- 1986-2003 computer performance increased ~ 50%/year
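As a quick consistency check on the figures above, the sketch below converts an annual growth rate into a doubling time and compounds the ~50%/year figure. The function names are illustrative, and treating 1986-2003 as 17 years of compounding is an assumption.

```python
import math


def doubling_time_months(annual_growth):
    """Months needed to double at a constant annual growth rate (0.40 means 40%)."""
    return 12 * math.log(2) / math.log(1 + annual_growth)


def cumulative_gain(annual_growth, years):
    """Total improvement factor after compounding for the given number of years."""
    return (1 + annual_growth) ** years


if __name__ == "__main__":
    for rate in (0.40, 0.55):
        print(f"{rate:.0%}/year doubles every {doubling_time_months(rate):.1f} months")
    # Assumption: 1986-2003 is treated as 17 years of ~50%/year growth.
    print(f"50%/year for 17 years: about {cumulative_gain(0.50, 17):.0f}x")
```

Growth of 40% to 55% per year works out to doubling roughly every 19 to 25 months, which is in line with the 18-24 month rule of thumb.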




- 25,000-fold hardware performance improvement since 1985
- Programs today trade execution performance for programmer productivity
- More programming is done in managed languages like Java, Python, and C#
- New applications have arisen: speech, sound, images, video


#### Moore's Law since 2003

- Microprocessor performance only 20%/year
  - Maximum power dissipation limits for air-cooled chips
  - Lack of additional instruction-level parallelism for hardware to exploit


#### **Computer components**

- Hardware
  - Transistors
  - Gates
  - Combinational and sequential circuits
  - Adders, decoders, mux/demux, latches, flip-flops, registers
  - Processors
  - Memory
  - etc

- Data types
  - Representations for character, integer, floating point, etc
    - more, od, xxd
  - Sign-magnitude
  - 1's complement
  - 2's complement
  - IEEE 754
  - BCD
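The representations listed above can be compared side by side with a short sketch. The helper names and the 8-bit default width are assumptions chosen for illustration; `struct` is used only to expose the IEEE 754 single-precision bit pattern.

```python
import struct


def sign_magnitude(value, bits=8):
    """Top bit holds the sign; the remaining bits hold the magnitude."""
    sign = 1 if value < 0 else 0
    return (sign << (bits - 1)) | abs(value)


def ones_complement(value, bits=8):
    """Negative values are the bitwise inverse of the magnitude."""
    mask = (1 << bits) - 1
    return value if value >= 0 else ~abs(value) & mask


def twos_complement(value, bits=8):
    """Negative values are the inverse of the magnitude plus one (value mod 2**bits)."""
    return value & ((1 << bits) - 1)


def bcd(value):
    """Binary-coded decimal: one 4-bit group per decimal digit (non-negative input)."""
    result = 0
    for digit in str(value):
        result = (result << 4) | int(digit)
    return result


def ieee754_single(value):
    """Bit pattern of a 32-bit IEEE 754 float, returned as an unsigned integer."""
    return struct.unpack(">I", struct.pack(">f", value))[0]


if __name__ == "__main__":
    print(f"sign-magnitude -5: {sign_magnitude(-5):08b}")
    print(f"1's complement -5: {ones_complement(-5):08b}")
    print(f"2's complement -5: {twos_complement(-5):08b}")
    print(f"BCD 94:            {bcd(94):08b}")
    print(f"IEEE 754 1.0:      {ieee754_single(1.0):032b}")
```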





- Software
  - Instructions for what to compute


#### How a modern computer works

[Figure: thread of execution and data movement between the CPU and memory]

#### Harvard architecture

- Idea by Howard Aiken, Harvard physicist, to IBM Nov. 1937
- Built by IBM in Endicott, NY and delivered to Harvard in Feb. 1944 as the Mark 1 computer
- Has separate memories for program (instructions) and data
- Input/output (I/O) to connect to the world
- Processor to carry out the computations




### (John) Von Neumann architecture

- Developed during his June 1945 train ride from Philadelphia to Los Alamos, NM
- He had programmed the Mark 1 in August 1944
- One memory for both data and program
- Same I/O
- Same processor



Figure 42. Illustration of the Von Neumann architecture: a computer made up of a processor, memory, and input/output facilities. Both programs and data can be stored in the same memory.

#### von Neumann vs Harvard architectures

- von Neumann
  - Same memory holds instructions and data
  - Single bus between CPU and memory
  - Flexible, more cost effective
- Harvard
  - Separate memories for data and instructions
  - Two busses
  - Allows two simultaneous memory fetches
  - Less flexible, memory is physically partitioned
- Both are stored program computer designs




#### **Processors**

- Device that performs automatic computation
  - Fixed logic: single operation
    - Traffic signal sequencer
  - Selectable logic: user can select from multiple hardwired functions
    - Car with Econo and Sports modes for transmission
  - Parameterized logic: computes fixed function on variable user input
    - Programmable video recorder
  - Programmable logic: processor
    - CPU, GPU, etc


#### **Stored programs**

- Some memories can be written to only once and then read many times
  - Read-Only Memory (ROM)
  - E.g., automobile engine control
- Some ROM can be programmed or re-written
  - PROM, programmable ROM (programmed once after manufacture)
  - EPROM, erasable programmable ROM
  - EEPROM, electrically erasable...
- Embedded systems often use PROM
  - Firmware upgrades


#### **Fetch-Execute**

At the highest level, a processor does this:

    repeat forever {
      FETCH: access the next program instruction from the location where it is stored
      EXECUTE: perform the actions described by the instruction
    }
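The loop above can be sketched as a toy interpreter. The four-instruction accumulator machine here (`LOADI`, `ADDI`, `JUMP`, `HALT`) is a hypothetical ISA invented for illustration; a `HALT` instruction is added so the example terminates, whereas a real processor keeps fetching.

```python
def run(program):
    """Run a toy accumulator machine (hypothetical ISA) and return the accumulator."""
    accumulator = 0
    pc = 0  # program counter: index of the next instruction to fetch
    while True:
        # FETCH: read the next instruction from where it is stored
        opcode, operand = program[pc]
        pc += 1
        # EXECUTE: perform the action the instruction describes
        if opcode == "LOADI":
            accumulator = operand
        elif opcode == "ADDI":
            accumulator += operand
        elif opcode == "JUMP":
            pc = operand
        elif opcode == "HALT":
            return accumulator
        else:
            raise ValueError(f"unknown opcode: {opcode}")


if __name__ == "__main__":
    # The JUMP skips the instruction at index 2, so the result is 2 + 3.
    program = [("LOADI", 2), ("JUMP", 3), ("ADDI", 100), ("ADDI", 3), ("HALT", 0)]
    print(run(program))
```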

#### **Intel Core i7 Processor**


#### **Motherboard**


#### **Architecture basics**

- Instruction set
  - Software instructions that the hardware executes
- Functional organization
  - How is the hardware partitioned into specialized units?


#### **Architecture basics**

- Logic design
  - Which logic circuits are used and how are they organized?
- Implementation
  - Technologies and packaging used




#### Hierarchical abstraction

- Hardware and software consist of layers in a hierarchy
  - To a good approximation
- Each layer hides (some of) its detail from the layer above
  - Principle of Abstraction
- Highest layer interacts with outside world/end user




#### **Instruction set architecture**

- Instruction set architecture (ISA) is a key level of abstraction
  - Primary interface between hardware and software
- Set of operations that a processor performs
- Instruction format defines an interpretation of bit strings
  - Similar to ASCII, 2's complement, IEEE 754, BCD, etc

#### Opcodes, operands, and results

- A bit string, interpreted as an instruction, specifies
  - Operations to be performed
  - Actual operand(s) and/or source(s) for the operand(s) and their type(s)
  - Destination for the result(s)
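To make the idea of interpreting a bit string as an instruction concrete, here is a minimal decoding sketch. The 16-bit layout (4-bit opcode, 4-bit destination register, two 4-bit source registers) is a made-up format for illustration, not the encoding of any real ISA.

```python
from typing import NamedTuple


class Instruction(NamedTuple):
    opcode: int  # which operation to perform
    dest: int    # register that receives the result
    src1: int    # first source register
    src2: int    # second source register


def decode(word):
    """Split a 16-bit word into fields of the hypothetical format described above."""
    return Instruction(
        opcode=(word >> 12) & 0xF,
        dest=(word >> 8) & 0xF,
        src1=(word >> 4) & 0xF,
        src2=word & 0xF,
    )


if __name__ == "__main__":
    print(decode(0x1423))
```

Under this assumed format, the same 16 bits that read as the integer 0x1423 decode to opcode 1 with destination register 4 and source registers 2 and 3.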







#### ISA Design

- Many tradeoffs
  - Instruction length
  - Number of registers
  - Number of instructions
  - etc

#### **CISC vs RISC**

- Complex Instruction Set Computer
- Reduced Instruction Set Computer
- RISC won
  - Even Intel uses RISC micro-instructions
    - They just have a really amazing instruction decoder




#### **Endianness**

- Imagine memory is read from lowest address to highest address
- Big Endian
  - Most significant, "big," byte comes first, i.e., placed in lowest numbered memory location.
  - "Big" end appears first when reading memory
  - Network traffic
  - PowerPC, ARM, SPARC, MIPS
- Little Endian
  - Reverse of Big Endian: least significant, "Little," byte placed in lowest address
  - "Little" end first


#### Example and comparison

Consider 0x00C0F380 = 0x 00 C0 F3 80 = 0b0000 0000 1100 0000 1111 0011 1000 0000

Most significant byte: 0x00 (0000 0000). Least significant byte: 0x80 (1000 0000).

Addresses arbitrarily start at 0x00000000; each row shows the byte at the given location, read from lowest address to highest.

| Memory address | Little endian | Big endian |
|---|---|---|
| 0x00000000 | 1000 0000 | 0000 0000 |
| 0x00000001 | 1111 0011 | 1100 0000 |
| 0x00000002 | 1100 0000 | 1111 0011 |
| 0x00000003 | 0000 0000 | 1000 0000 |
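The byte orders for 0x00C0F380 can be checked with Python's built-in `int.to_bytes`; the helper name `memory_layout` is an assumption made for this sketch.

```python
def memory_layout(value, byteorder, size=4):
    """Return the bytes of value as binary strings, lowest memory address first."""
    return [f"{byte:08b}" for byte in value.to_bytes(size, byteorder)]


if __name__ == "__main__":
    value = 0x00C0F380
    little = memory_layout(value, "little")
    big = memory_layout(value, "big")
    for offset in range(4):
        print(f"0x{offset:08X}  {little[offset]}  {big[offset]}")
```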

© 2017 by George B. Adams III. Portions © 2017 Dr. Jeffrey A. Turkstra



#### Memory hierarchy

#### **Registers**

- Type of memory located inside CPU
- Can hold a single piece of data
  - Data processing
  - Control
- Many registers
  - More later







#### **Example**

Writing a program using these instructions is programming in assembly language. Example:

| Assembly instr. | Comments |
|---|---|
| load r2, 20(r1) | r2 ← Data\_Memory[20+r1] |
| load r3, 24(r1) | r3 ← Data\_Memory[24+r1] |
| add r4, r2, r3 | r4 ← r2 + r3 |
| store r4, 28(r1) | Data\_Memory[28+r1] ← r4 |
| jump 60(r7) | Fetch at Instr.\_Memory[60+r7] |

- r1, r2, r3, r4 are registers in the data path
- 20, 24, 28 are decimal constants
- "x ←" means "x" is the location for the result
- Memory[x] means the contents of memory at address x
- + means addition, with operand type defined by the instruction (r1 + r2 is an add with a different data type than 28+r3)
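The effect of the five instructions can be traced with a small simulation. The dictionary-based register file and data memory, and the starting values used below, are assumptions made for illustration.

```python
def run_example(registers, data_memory):
    """Apply the five example instructions and return the next fetch address."""
    r = registers
    r["r2"] = data_memory[20 + r["r1"]]   # load  r2, 20(r1)
    r["r3"] = data_memory[24 + r["r1"]]   # load  r3, 24(r1)
    r["r4"] = r["r2"] + r["r3"]           # add   r4, r2, r3
    data_memory[28 + r["r1"]] = r["r4"]   # store r4, 28(r1)
    return 60 + r["r7"]                   # jump  60(r7)


if __name__ == "__main__":
    # Hypothetical initial state: r1 is a base address, r7 a code base address.
    registers = {"r1": 100, "r7": 400}
    data_memory = {120: 7, 124: 35}
    next_fetch = run_example(registers, data_memory)
    print(data_memory[128], next_fetch)
```

With these starting values, the sum 7 + 35 is stored at address 128 and the next instruction is fetched from address 460.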













#### Intel Core microarchitecture pipeline

#### **Direct memory access**

- DMA allows other hardware subsystems to access main memory without going through the CPU
- Modern systems usually have a DMA controller (MMU)
  - Memory address register, byte count, control, etc
  - Responsible for ensuring accesses are properly restrained
    - Attack vector
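A toy model can show how the address and byte-count registers, together with a bounds check, restrain a transfer. The `DmaController` class and its permitted-window check are hypothetical and do not correspond to any real device interface.

```python
class DmaController:
    """Toy DMA controller: copies device data into memory without a CPU copy loop."""

    def __init__(self, memory, allowed_start, allowed_end):
        self.memory = memory
        self.allowed = range(allowed_start, allowed_end)  # permitted addresses
        self.address = 0      # memory address register
        self.byte_count = 0   # byte count register

    def program(self, address, byte_count):
        """Set up a transfer, rejecting any that leaves the permitted window."""
        last = address + byte_count - 1
        if byte_count <= 0 or address not in self.allowed or last not in self.allowed:
            raise PermissionError("transfer outside the permitted window")
        self.address = address
        self.byte_count = byte_count

    def transfer_from_device(self, device_data):
        """Copy at most byte_count bytes from the device into memory."""
        chunk = device_data[: self.byte_count]
        self.memory[self.address : self.address + len(chunk)] = chunk
        return len(chunk)


if __name__ == "__main__":
    memory = bytearray(16)
    dma = DmaController(memory, allowed_start=4, allowed_end=12)
    dma.program(address=4, byte_count=4)
    dma.transfer_from_device(b"DATA")
    print(memory)
```

Without the window check in `program`, a device could write anywhere in memory, which is why DMA is listed as an attack vector.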


#### **MMU**

- Responsible for "refreshing" DRAM
- Translates virtual memory addresses to physical addresses
- Sometimes part of CPU
- Sometimes not
  - Northbridge for Intel until recently
  - i7/i5 have an Integrated Memory Controller (IMC)




#### **Execution modes**

- CPU hardware has several possible modes
  - At any one time, in one mode
- Modes specify
  - Privilege level
  - Valid instructions
  - Valid memory addresses
  - Size of data items
  - Backwards compatibility


### Rings


#### Ring -1

- Intel Active Management Technology
- Exists for other architectures as well
- Runs on the Intel Management Engine (ME)
  - Isolated and protected coprocessor
  - Embedded in all current Intel chipsets
  - ARC core
  - Out-of-band access
  - Direct access to Ethernet controller
- Requires vPro-enabled CPU/Motherboard/Chipset




#### Ring -1

- ...if you can exploit it, you win.
  - CVE-2017-5689
- Go read about it


#### **Trusting trust**

- Reflections on Trusting Trust
  - by Ken Thompson
- Read this too


## How to change between modes

- Automatic
  - Hardware interrupts
  - OS-specified handlers
- "Manual"
  - Initiated by software, typically OS
  - System calls, signals, and page faults
  - Sometimes mode can be set by application




#### Paging and Virtual Memory

- ...later


### **Questions?**
