rlm@74: #LyX 1.6.4 created this file. For more info see http://www.lyx.org/
rlm@74: \lyxformat 345
rlm@74: \begin_document
rlm@74: \begin_header
rlm@74: \textclass article
rlm@74: \use_default_options true
rlm@74: \language english
rlm@74: \inputencoding auto
rlm@74: \font_roman default
rlm@74: \font_sans default
rlm@74: \font_typewriter default
rlm@74: \font_default_family default
rlm@74: \font_sc false
rlm@74: \font_osf false
rlm@74: \font_sf_scale 100
rlm@74: \font_tt_scale 100
rlm@74: 
rlm@74: \graphics default
rlm@74: \paperfontsize default
rlm@74: \use_hyperref false
rlm@74: \papersize default
rlm@74: \use_geometry false
rlm@74: \use_amsmath 1
rlm@74: \use_esint 1
rlm@74: \cite_engine basic
rlm@74: \use_bibtopic false
rlm@74: \paperorientation portrait
rlm@74: \secnumdepth 3
rlm@74: \tocdepth 3
rlm@74: \paragraph_separation indent
rlm@74: \defskip medskip
rlm@74: \quotes_language english
rlm@74: \papercolumns 1
rlm@74: \papersides 1
rlm@74: \paperpagestyle default
rlm@74: \tracking_changes false
rlm@74: \output_changes false
rlm@74: \author "" 
rlm@74: \author "" 
rlm@74: \end_header
rlm@74: 
rlm@74: \begin_body
rlm@74: 
rlm@74: \begin_layout Title
rlm@74: Pygar: Parallel Audio Processing
rlm@74: \end_layout
rlm@74: 
rlm@74: \begin_layout Author
rlm@74: Laurel Pardue, Robert McIntyre
rlm@74: \end_layout
rlm@74: 
rlm@74: \begin_layout Subsection*
rlm@74: Problem
rlm@74: \end_layout
rlm@74: 
rlm@74: \begin_layout Standard
rlm@74: Music naturally comes in parallel sequences of samples called 
rlm@74: \emph on
rlm@74: voices
rlm@74: \emph default
rlm@74:  (e.g., from multiple instruments).
rlm@74:  Pure-software mixers are forced to pass these voices through the von Neumann
rlm@74:  bottleneck of a single processor, operating on the streams in series
rlm@74:  and switching rapidly between them.
rlm@74:  This limits the number of voices they can handle.
rlm@74:  Worse, because every voice shares the same processor, too many voices at
rlm@74:  once can saturate the processor and crash the system.
rlm@74:  On typical laptop hardware running a high-end software tool such as ProTools,
rlm@74:  the limit is around 5 voices.
rlm@74:  Embedded devices have an even harder time meeting reasonable timing
rlm@74:  requirements.
rlm@74: \end_layout
rlm@74: 
rlm@74: \begin_layout Standard
rlm@74: [Screenshot: just 6 voices are enough to bring this session of ProTools
rlm@74:  to its knees.]
rlm@74: \end_layout
rlm@74: 
rlm@74: \begin_layout Standard
rlm@74: We want the power of writing transforms for voices in a high level language
rlm@74:  combined with a framework that applies these transforms to the voices in
rlm@74:  parallel.
rlm@74: \end_layout
rlm@74: 
rlm@74: \begin_layout Subsection*
rlm@74: Vision --- Pygar
rlm@74: \end_layout
rlm@74: 
rlm@74: \begin_layout Standard
rlm@74: Our system addresses the limitations of pure-software mixers.
rlm@74:  It is a grid of SMIPS processors capped by a mixer.
rlm@74:  The voices flow through the processors in parallel and are combined at
rlm@74:  the final mixer into a single stream.
rlm@74:  Each processor can be loaded with an arbitrary C program.
rlm@74: \end_layout
rlm@74: 
rlm@74: \begin_layout Standard
rlm@74: \begin_inset Float figure
rlm@74: placement H
rlm@74: wide false
rlm@74: sideways false
rlm@74: status collapsed
rlm@74: 
rlm@74: \begin_layout Plain Layout
rlm@74: \begin_inset Graphics
rlm@74: 	filename ../../../../Pygar/documents/000402.png
rlm@74: 	width 5in
rlm@74: 
rlm@74: \end_inset
rlm@74: 
rlm@74: 
rlm@74: \begin_inset Caption
rlm@74: 
rlm@74: \begin_layout Plain Layout
rlm@74: The audio data (“samples”) start in the memory, but are soon pulled into
rlm@74:  action by the DMA (direct memory access).
rlm@74:  The DMA sends the samples to a chain of zero or more soft-cores, where they
rlm@74:  are transformed according to the soft-cores’ algorithms.
rlm@74:  After running the gauntlet of soft-cores, the samples flow first to a buffering
rlm@74:  FIFO, and finally to a mixer, which sends the samples off to be played
rlm@74:  by speakers or stored in a file.
rlm@74:  
rlm@74: \end_layout
rlm@74: 
rlm@74: \end_inset
rlm@74: 
rlm@74: 
rlm@74: \end_layout
rlm@74: 
rlm@74: \begin_layout Plain Layout
rlm@74: 
rlm@74: \end_layout
rlm@74: 
rlm@74: \end_inset
rlm@74: 
rlm@74: 
rlm@74: \end_layout
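rlm@74: 
rlm@74: \begin_layout Standard
rlm@74: The data flow in the figure can be modelled in software as a rough sketch.
rlm@74:  The C code below is a sequential model for illustration only, not the hardware
rlm@74:  implementation.
rlm@74:  The names transform_fn, run_chain, mix and halve are hypothetical, and both
rlm@74:  the 16-bit sample width and the saturating sum used by the mixer are assumptions.
rlm@74: \end_layout
rlm@74: 
```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type: one soft-core stage maps an input sample to an output sample. */
typedef int16_t (*transform_fn)(int16_t sample);

/* Pass one sample through a chain of zero or more stages, in order. */
int16_t run_chain(const transform_fn *chain, size_t n_stages, int16_t sample)
{
    for (size_t i = 0; i < n_stages; i++)
        sample = chain[i](sample);
    return sample;
}

/* Assumed mixer behaviour: add one sample per voice and clamp the sum
 * to the 16-bit range. Suitable for a modest number of voices only,
 * since the accumulator is 32 bits wide. */
int16_t mix(const int16_t *samples, size_t n_voices)
{
    int32_t sum = 0;
    for (size_t i = 0; i < n_voices; i++)
        sum += samples[i];
    if (sum > INT16_MAX) sum = INT16_MAX;
    if (sum < INT16_MIN) sum = INT16_MIN;
    return (int16_t)sum;
}

/* Example stage, for illustration only: halve the sample. */
int16_t halve(int16_t sample)
{
    return (int16_t)(sample / 2);
}
```
rlm@74: 
rlm@74: \begin_layout Standard
rlm@74: In this model each voice would call run_chain with its own array of stages,
rlm@74:  and mix would then combine one output sample per voice.
rlm@74:  In the hardware the chains run concurrently rather than one after another.
rlm@74: \end_layout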
rlm@74: 
rlm@74: \begin_layout Subsection*
rlm@74: Steps
rlm@74: \end_layout
rlm@74: 
rlm@74: \begin_layout Standard
rlm@74: The difficult part of this project is managing code reuse.
rlm@74:  We need three things for success.
rlm@74:  
rlm@74: \end_layout
rlm@74: 
rlm@74: \begin_layout Itemize
rlm@74: SMIPS processor -- Easy.
rlm@74:  Just use the Lab 5 processors.
rlm@74: \end_layout
rlm@74: 
rlm@74: \begin_layout Itemize
rlm@74: A way to program the processors.
rlm@74: \end_layout
rlm@74: 
rlm@74: \begin_layout Itemize
rlm@74: DMA (Direct Memory Access) to load voices into the processors.
rlm@74: \end_layout
rlm@74: 
rlm@74: \begin_layout Standard
rlm@74: We use ScratchPad to load code into the processors.
rlm@74:  ScratchPad is an Intel module that implements a cache hierarchy.
rlm@74:  The hierarchy reaches from RAM created on the FPGA, through on-chip DRAM
rlm@74:  and RAM on the host computer, all the way to the hard disk of the host
rlm@74:  computer.
rlm@74:  The first time a processor accesses one of its instructions, the
rlm@74:  cache goes all the way to the hard disk of the host computer to retrieve
rlm@74:  it.
rlm@74:  Subsequent accesses to the same instruction go only as far as the on-chip DRAM.
rlm@74:  Each processor has its own ScratchPad and can therefore be programmed independently.
rlm@74:  The ScratchPad abstraction allows each processor to run a program of any
rlm@74:  size.
rlm@74: \end_layout
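rlm@74: 
rlm@74: \begin_layout Standard
rlm@74: The caching behaviour described above (a slow first access, faster repeat
rlm@74:  accesses) can be illustrated with a minimal two-level model.
rlm@74:  This sketch is not the ScratchPad interface: the type scratch_model, the
rlm@74:  functions scratch_init and scratch_read, the direct-mapped organisation and
rlm@74:  the sizes are all assumptions chosen for illustration.
rlm@74: \end_layout
rlm@74: 
```c
#include <stdbool.h>
#include <stdint.h>

#define CACHE_LINES   64u    /* assumed size of the fast level */
#define BACKING_WORDS 1024u  /* assumed size of the slow level */

/* Hypothetical two-level model: a small direct-mapped cache in front
 * of a slow backing store that stands in for storage on the host. */
typedef struct {
    uint32_t backing[BACKING_WORDS];
    uint32_t data[CACHE_LINES];
    uint32_t tag[CACHE_LINES];
    bool     valid[CACHE_LINES];
    unsigned slow_reads;          /* number of trips to the slow level */
} scratch_model;

void scratch_init(scratch_model *s)
{
    for (uint32_t i = 0; i < BACKING_WORDS; i++)
        s->backing[i] = i * 3u;   /* dummy contents */
    for (uint32_t i = 0; i < CACHE_LINES; i++)
        s->valid[i] = false;
    s->slow_reads = 0;
}

/* Read one word. A miss fetches from the slow level and fills the
 * cache line; a hit is served from the fast level alone. */
uint32_t scratch_read(scratch_model *s, uint32_t addr)
{
    addr %= BACKING_WORDS;
    uint32_t line = addr % CACHE_LINES;
    if (!(s->valid[line] && s->tag[line] == addr)) {
        s->data[line]  = s->backing[addr];
        s->tag[line]   = addr;
        s->valid[line] = true;
        s->slow_reads++;
    }
    return s->data[line];
}
```
rlm@74: 
rlm@74: \begin_layout Standard
rlm@74: Reading the same address twice in this model increments slow_reads only
rlm@74:  once, which mirrors the first-access and repeat-access behaviour described
rlm@74:  above.
rlm@74: \end_layout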
rlm@74: 
rlm@74: \begin_layout Standard
rlm@74: Access to the music data is provided by RRR, another Intel abstraction, which
rlm@74:  lets us treat the hard disk of the host computer as if it were a normal FIFO.
rlm@74: \end_layout
rlm@74: 
rlm@74: \begin_layout Subsection*
rlm@74: News
rlm@74: \end_layout
rlm@74: 
rlm@74: \begin_layout Standard
rlm@74: We have run our system with 12 sample voices and various combinations of
rlm@74:  simple C voice-processing programs, and the results have been better than
rlm@74:  those of software implementations.
rlm@74:  Significantly, increasing the number of voices does not increase the load
rlm@74:  on any individual processor, because each voice is processed in parallel.
rlm@74: \end_layout
rlm@74: 
rlm@74: \begin_layout Subsection*
rlm@74: Contributions
rlm@74: \end_layout
rlm@74: 
rlm@74: \begin_layout Itemize
rlm@74: Implemented Pygar, a system for quick parallel processing of audio.
rlm@74:  
rlm@74: \end_layout
rlm@74: 
rlm@74: \begin_layout Itemize
rlm@74: Implemented 4 basic algorithms which serve as components for this system
rlm@74:  (identity, bit-shift, volume-change, and delay) 
rlm@74: \end_layout
rlm@74: 
rlm@74: \begin_layout Itemize
rlm@74: Demonstrated that Pygar outperforms software-only systems.
rlm@74:  Pure-software systems have a limit of around 6 voices, while our system
rlm@74:  achieves 12 voices in parallel with no architecturally imposed limit on
rlm@74:  the number of voices.
rlm@74: \end_layout
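rlm@74: 
rlm@74: \begin_layout Standard
rlm@74: To give a feel for the four component algorithms listed above, here is
rlm@74:  a minimal C sketch of what such transforms could look like.
rlm@74:  It is not the code that ran on the processors.
rlm@74:  The function names, the 16-bit sample type, the fixed-point gain format,
rlm@74:  and the delay length are all assumptions made for illustration.
rlm@74: \end_layout
rlm@74: 
```c
#include <stddef.h>
#include <stdint.h>

/* Identity: pass the sample through unchanged. */
int16_t identity(int16_t s)
{
    return s;
}

/* Bit-shift: shift the sample right by `shift` bits. For negative
 * samples the result of >> is implementation-defined in C; most
 * compilers perform an arithmetic shift. */
int16_t bit_shift(int16_t s, unsigned shift)
{
    return (int16_t)(s >> shift);
}

/* Volume change: scale by an assumed fixed-point gain where 256
 * means unity, clamping the result to the 16-bit range. */
int16_t volume(int16_t s, int32_t gain_q8)
{
    int32_t v = ((int32_t)s * gain_q8) / 256;
    if (v > INT16_MAX) v = INT16_MAX;
    if (v < INT16_MIN) v = INT16_MIN;
    return (int16_t)v;
}

/* Delay: a ring buffer that returns the sample seen DELAY_LEN
 * steps earlier (zero until the buffer has filled). */
#define DELAY_LEN 4

typedef struct {
    int16_t buf[DELAY_LEN];
    size_t  pos;
} delay_line;

void delay_init(delay_line *d)
{
    for (size_t i = 0; i < DELAY_LEN; i++)
        d->buf[i] = 0;
    d->pos = 0;
}

int16_t delay_step(delay_line *d, int16_t in)
{
    int16_t out = d->buf[d->pos];
    d->buf[d->pos] = in;
    d->pos = (d->pos + 1) % DELAY_LEN;
    return out;
}
```
rlm@74: 
rlm@74: \begin_layout Standard
rlm@74: Unlike the other three, the delay transform keeps state between samples,
rlm@74:  so in this sketch each voice would need its own delay_line.
rlm@74: \end_layout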
rlm@74: 
rlm@74: \end_body
rlm@74: \end_document