#LyX 1.6.4 created this file. For more info see http://www.lyx.org/
\lyxformat 345
\begin_document
\begin_header
\textclass article
\use_default_options true
\language english
\inputencoding auto
\font_roman default
\font_sans default
\font_typewriter default
\font_default_family default
\font_sc false
\font_osf false
\font_sf_scale 100
\font_tt_scale 100

\graphics default
\paperfontsize default
\use_hyperref false
\papersize default
\use_geometry false
\use_amsmath 1
\use_esint 1
\cite_engine basic
\use_bibtopic false
\paperorientation portrait
\secnumdepth 3
\tocdepth 3
\paragraph_separation indent
\defskip medskip
\quotes_language english
\papercolumns 1
\papersides 1
\paperpagestyle default
\tracking_changes false
\output_changes false
\author ""
\author ""
\end_header

\begin_body

\begin_layout Title
Pygar: Parallel Audio Processing
\end_layout

\begin_layout Author
Laurel Pardue, Robert McIntyre
\end_layout

\begin_layout Subsection*
Problem
\end_layout

\begin_layout Standard
Music naturally comes in parallel sequences of samples called 
\emph on
voices
\emph default
 (e.g., from multiple instruments).
 Pure-software mixers are forced to pass these voices through the von Neumann
 bottleneck of a single processor, operating on the streams in series
 and switching rapidly among them.
 They are therefore naturally limited in the number of voices they can handle.
 Worse, because every voice has to share the same processor,
 too many voices at once can saturate the processor and crash the system.
 On typical laptop hardware running a high-end software tool like ProTools,
 this limit is around 5 voices.
 Embedded devices have an even tougher time meeting any sort of reasonable
 timing requirements.
\end_layout

\begin_layout Standard
[Screenshot: just 6 voices are enough to bring this session of ProTools to
 its knees.]
\end_layout

\begin_layout Standard
We want the power of writing transforms for voices in a high-level language
 combined with a framework that applies these transforms to the voices in
 parallel.
\end_layout

\begin_layout Subsection*
Vision --- Pygar
\end_layout

\begin_layout Standard
Our system addresses the limitations of pure-software mixers.
 It is a grid of SMIPS processors capped by a mixer.
 The voices flow through the processors in parallel and are combined at
 the final mixer into a single stream.
 Each processor can be loaded with an arbitrary C program.
\end_layout

\begin_layout Standard
\begin_inset Float figure
placement H
wide false
sideways false
status collapsed

\begin_layout Plain Layout
\begin_inset Graphics
	filename ../../../../Pygar/documents/000402.png
	width 5in

\end_inset


\begin_inset Caption

\begin_layout Plain Layout
The audio data (“samples”) start in memory, but are soon pulled into
 action by the DMA (direct memory access).
 The DMA sends the samples to a chain of zero or more soft-cores, where they
 are transformed according to the soft-cores’ algorithms.
 After running the gauntlet of soft-cores, the samples flow first to a buffering
 FIFO, and finally to a mixer, which sends the samples off to be played
 by speakers or stored in a file.

\end_layout

\end_inset


\end_layout

\begin_layout Plain Layout

\end_layout

\end_inset


\end_layout

\begin_layout Subsection*
Steps
\end_layout

\begin_layout Standard
The difficult part of this project is managing code reuse.
 We need three things for success.

\end_layout

\begin_layout Itemize
SMIPS processors: easy, we just use the Lab 5 processors.
\end_layout

\begin_layout Itemize
Some way to program the processors.
\end_layout

\begin_layout Itemize
DMA (direct memory access) to load voices into the processors.
\end_layout

\begin_layout Standard
We use ScratchPad to load code into the processors.
 ScratchPad is an Intel module that implements a cache hierarchy.
 The hierarchy reaches from RAM created on the FPGA, through on-chip
 DRAM and RAM on the host computer, all the way to the hard disk of the host
 computer.
 The first time a processor tries to access one of its instructions, the
 cache goes all the way to the hard disk of the host computer to retrieve
 the data.
 Subsequent attempts to access this data only go as far as the on-chip DRAM.
 Each processor has its own ScratchPad and thus can be programmed independently.
 The ScratchPad abstraction allows each processor to run a program of any
 size.
\end_layout

\begin_layout Standard
Music access is achieved through RRR, another Intel abstraction, which allows
 us to treat the hard disk of the host computer as if it were a normal FIFO.
\end_layout

\begin_layout Subsection*
News
\end_layout

\begin_layout Standard
We have run our system with 12 sample voices and various combinations of
 simple C voice-processing programs, and the results have been better than
 those of software implementations.
 Significantly, increasing the number of voices does not increase the load
 on any single processor, since each voice is processed in parallel.
\end_layout

\begin_layout Subsection*
Contributions
\end_layout

\begin_layout Itemize
Implemented Pygar, a system for quick parallel processing of audio.

\end_layout

\begin_layout Itemize
Implemented 4 basic algorithms that serve as components for this system
 (identity, bit-shift, volume-change, and delay).
\end_layout

\begin_layout Itemize
Demonstrated that Pygar outperforms software-only systems.
 Pure-software systems have a limit of around 6 voices, while our system
 achieves 12 voices in parallel with no architecturally imposed limit on
 the number of voices.
\end_layout

\end_body
\end_document