FREE ELECTRONIC LIBRARY - Online materials, documents

Pages:   || 2 | 3 | 4 | 5 |

«Automated Design of Finite State Machine Predictors Timothy Sherwood Brad Calder Department of Computer Science and Engineering University of ...»

-- [ Page 1 ] --

Automated Design of Finite State Machine Predictors

Timothy Sherwood Brad Calder

Department of Computer Science and Engineering

University of California, San Diego


Index Terms: Optimizing State Machines, Branch Predicition, Value Prediction, Confidence Estimation


Finite State Machines (FSM) are a fundamental building block in computer architecture, and are used to

control and optimize all types of prediction and speculation, now even in the embedded space. They are used for branch prediction, cache replacement policies, and confidence estimation and accuracy counters for a variety of optimizations.

In this paper, we present a framework for automated design of small FSM predictors for general purpose and customized processors. Our approach can be used to automatically generate small FSM predictors to perform well over a suite of applications for a general purpose processor, or tailored to a specific application, or even a specific instruction for a customized processor. We evaluate the performance of automatically generating FSM predictors for branch prediction and confidence estimation for value prediction.

1 Introduction An important part of creating any processor is the workload that was used to guide the design and the testing of the processor. Even general purpose processors are customized to a suite of common applications, such as the SPEC2000 and other benchmark suites. The general purpose architecture structures and predictors used in the processor were customized to achieve the best average performance over the design workload.

Customized embedded processors use compiler analysis and design automation techniques to take a generalized architectural model and create a specific instance of it optimized to a given application or very small set of applications. This process can range from whole processor synthesis to having a general processor core where a few specific architecture components are customized. Typically the design relies upon optimizing and combining well understood lower level primitives.

A common structure in both general purpose and custom processors is the finite state machine predictor, which is most commonly used as a branch prediction counter or for confidence estimation. The more established uses of these finite state machine predictors (e.g., branch prediction) are now finding their way into embedded processors and DSPs [21]. For general purpose processors, confidence estimation has been shown to significantly improve the performance of value prediction, address prediction, memory disambiguation, thread speculation, and prefetching in modern high performance processors.

A Finite State Machine (FSM) consists of a set of states, a start state, an input alphabet, and a transition function that maps an input symbol and current state to next state. A Moore machine extends this with an output on each state. If one considers the case where the alphabet and output symbols are constrained to be only {0, 1}, a finite state machine can be used to generate yes/no predictions. The output at a given state is its prediction of the next input. Either intentionally or accidentally, many FSMs in computer architecture are of this form, with the most famous of these being the two-bit saturating counter. In a branch predictor, these simple state machines generate predictions of taken after a series of takens, or not-taken after seeing a series of not-takens.

In this paper we present an automated approach for creating finite state machine predictors. We use profiles to generate a language of predictions of a certain type, which is expressed as a regular expression. The regular expression is then converted into a FSM. The VHDL for synthesis is then generated from the FSM. Using this approach we can automatically generate FSM predictors that perform well over a suite of applications, tailored to a specific application, and even a specific instruction. We demonstrate the ability to automatically generate accurate FSM predictors tailored to specific branches for branch prediction. We then examine using our approach to generate accurate confidence estimation for value prediction to be used for a general purpose processor, so the FSM is customized to achieve the best average performance over the whole suite of applications examined.

Section 3 describes current FSM predictors used in computer architecture, and prior work into automatically generating predictors. Section 4 details our approach to automatically generate FSM predictors. Section 7 uses our automated approach to generate a custom branch prediction architecture, presents miss rates for our custom architecture, and examines in detail some of the FSM predictors generated by our approach. Section 2 describes other potential uses for FSM predictor customization and initial results for using our approach for automatic generation of confidence estimators for value prediction. Finally, section 8 provides a summary of our research.

2 Uses of Finite State Machine Predictors In this section we describe how FSM predictors are used in a few areas of computer architecture, to motivate the use of our approach.

2.1 Value, Address, and Load Speculation Confidence counters have been used to guide many speculative execution architectures. These include value prediction, address prediction, instruction reuse, memoization, load speculation, and memory renaming [4]. In all of these architectures, speculation occurs to hide latency caused by a real or potential dependencies. When a misprediction occurs, the penalty can be costly. Therefore, confidence estimation counters are used to guide when to use the speculative optimization. In Section 6, we examine automatically generating FSM confidence counters to guide value prediction for a general purpose processor.

2.2 Branch Prediction A branch predictor uses FSM counters to provide conditional branch prediction, with the 2-bit counter being the most wildly known FSM predictor [40, 49]. In designing a branch predictor for a processor, our approach can be used to generated FSM branch prediction counters customized to a whole workload for a general purpose processor, or for a specific application or individual branches for a customized processor. In Section 7, we examine the performance of generating FSM predictors for specific branches in the context of building a customized processor.

2.3 Speculative Threaded Execution There have been several recent studies into speculative threaded execution in order to obtain more parallelism.

These range from executing both paths of a branch [17, 46, 47, 20] to speculatively generating threads to run ahead of the main thread of execution [42, 44]. These architecture designs use FSM predictors to predict when to spawn speculative threads or when to execute down additional paths.

2.4 Cache Management Cache management schemes have been proposed that perform intelligent replacement [25], cache exclusion [45], and they use a small FSM counter to determine when the optimization should be applied. In addition, prefetching architectures have used FSM predictors to determine when to initiate prefetching for a load and to guide stream buffer allocation [39].

2.5 Power Control Manne et. al. [24] examined using confidence estimation to find branches that had a high miss rate, and then for those branches, stall the fetch unit until the branch direction is resolved. This can save a significant amount of power for branches that have a high miss rate, and is beneficial for low power processors. Musoll [28] proposed using hardware predictors to predict whether an access to certain hardware blocks can be avoided in order to save power.

3 Prior Work In this section we summarize current work on FSM predictors and prior work into automatically finding or generating predictors.

3.1 FSM Predictor Implementations The majority of FSM predictors used in prior research are Saturating Up and Down (SUD) counters. Four values define a SUD counter – (saturation threshold, correct increment, wrong decrement, and a prediction threshold).

A SUD counter can have a value between 0 and the saturation threshold. If the prediction is correct, the counter is incremented by the correct increment value. If it is incorrect, the counter is decreased by the wrong decrement value.

The majority of current processors use a branch predictor that indexes into a table of 2-bit SUD counters [40, 49]. The counter is incremented when the branch is taken, and decremented with not-taken, with a saturating threshold of 3. When the counter has a value less than or equal to 1, the branch is predicted as not-taken. If the counter has a value greater than 1 then it is predicted as taken. Some of these branch prediction architectures have several prediction tables, and a Meta table of SUD counters are used to pick which predictor to use.

Jacobsen et. al. [19] proposed the idea of confidence estimation and examined its relationship to branch prediction. In their study, they used saturating up and down (SUD) counters, and Resetting Counters to provide the confidence estimation. A resetting counter resets the counter back to 0 when there is a misprediction. Lick et. al. [46] used profiling to search for branch history patterns that provided correct predictions and that also had a high degree of confidence. They then examined a confidence estimator that predicted high confidence when these patterns were seen, and low confidence when the other patterns occurred. Grunwald et. al. [16] presented several new metrics for evaluating confidence estimators and provided a detail comparison of using SUD counters, resetting counters, and static confidence estimation.

Our focus in this research [36] is an automated approach for generating small FSMs. The outcome of each state of our FSM counter is a binary decision of yes or no. This captures most of the implementations of FSMs, but not all. Confidence estimators can return a number representing a probability instead of a binary decision, and several different actions could be performed based upon the degree of confidence returned. Even so, most confidence estimation implementations have only one prediction threshold for the SUD counter used, and therefore fits into the space of FSM counters our approach addresses.

3.2 Automatically Generating Predictors Burtscher and Zorn [2] examined using profiles to find N-bit value prediction histories that were highly confident.

For each possible history, their profile gave what the prediction probability would be if values were predicted using that history. They then used this profile, along with a desired accuracy they wanted to achieve, to select which histories would be used to generate value predictions, and which histories would predict low confidence.

This was then used to guide confidence estimation for their value prediction architecture.

Chen et. al. [6] examined using techniques from data compression to improve the performance of branch prediction. They looked at using Prediction by Partial Matching (PPM), where there are M tables from size 2 to 2M. Each PPM entry contains a frequency for the number of times the next bit was 0 (not-taken) and the number of times it was (1) taken. All of the PPM tables are then searched in parallel for each history length. The PPM table entry that had the highest probability was then used for the prediction.

Emer and Gloy [9] have the closest prior work to ours where they examine using genetic programming with feedback to search the predictor design space. They developed a language to describe valid branch prediction architectures, which consists of a variety of predictor primitives (e.g., counters, tables, values), their sizes, and functions to combine these primitives. Using genetic programming techniques, they search for new predictors by performing crossovers and mutating recent candidates, and they evaluate each predictors potential by examining how well it performs for a given set of benchmarks. In contrast, our approach automatically builds FSM predictors from behavioral traces, without searching. Our approach can generate FSM predictors that cannot be represented easily or naturally by the branch prediction language defined by Emer and Gloy. Conversely, our approach does not generate or examine the design space of table based prediction architectures. Therefore, our approach is better for finding efficient predictors for small design areas, whereas their approach can potentially find better solutions for designs with larger areas, and it may be beneficial to combine the two approaches.

4 Automated Design of a FSM Predictor To design a FSM predictor we go through a fairly complex design flow starting with profile information and finishing with synthesizable VHDL code. While the techniques described here can be used for any sort of predictor, we describe building FSM predictors for branch prediction throughout this section to help explain our approach.

For clarity use a more general notation of 1 and 0 instead of taken and not taken respectively. We start off with a high level overview of the design chain we have developed and then discuss each step in detail with an example.

4.1 Overview Regular expressions provide us with a way to specify patterns and then have them converted into finite state machines because they are both related by formal language. A formal language is a set of strings, either finite or infinite, made up of a sequence of elements from an alphabet, in our case the set of zeros and ones.

A regular expression provides a recursive way of testing an input string to see if it belongs to a given language.

The alternative is to define an iterative way of testing a string, which is what a finite state machine does. The two are related because they both perform the same function, recognizing strings, and a mapping from one to the other can always be found.

We now show a way of exploiting this relationship for the purpose of creating a predictor. Let us say that si = {b1, b2,..., bi }, which means si is the string formed by the concatenation of all the input sequences from the start until i. Now let s = {s1, s2,..., si }. This is the set of all possible inputs the predictor could see over time. If we pick a subset L, of s, where L is the set of input strings that satisfy some metric we have chosen, we can say that L is the language of predict 1.

Pages:   || 2 | 3 | 4 | 5 |

Similar works:

«This 22-page story was anthologized in the 1922 book THE CLICKING OF CUTHBERT. It is in the public domain. The Clicking Of Cuthbert by P.G. Wodehouse The young man came into the smoking-room of the clubhouse, and flung his bag with a clatter on the floor. He sank moodily into an arm-chair and pressed the bell. Waiter! Sir? The young man pointed at the bag with every evidence of distaste. You may have these clubs, he said. Take them away. If you don't want them yourself, give them to one of the...»

«Abhishek Sarkar Jadavpur University THOMAS DEKKER AND THE SPECTRE OF UNDERWORLD JARGON Abstract My paper seeks to locate Thomas Dekker’s handling of underworld jargon at the interface of oral and literary cultures. The paper briefly looks at a play co-authored by Dekker and then examines two ‘‘coney-catching pamphlets” by him to see how he tries to appropriate cant or criminal lingo (necessarily an oral system) as an aesthetic/commercial programme. In these two tracts (namely, The...»

«Hole in the Wall: Informed Short Selling Ahead of Private Placements Henk Berkman, Michael D. McKenzie and Patrick Verwijmeren April 2014 Abstract Companies planning a private placement typically gauge the interest of potential buyers before the offering is publicly announced. Regulators are concerned with this practice, called wallcrossing, as it might invite insider trading, especially when the potential investors are hedge funds. We examine privately placed common stock and convertible...»

«Tips for Avoiding a Predatory Mortgage Loan What is Predatory Mortgage Lending? A predatory mortgage is a needlessly expensive home loan that provides no financial benefit to the borrower in return for the extra costs. In many cases, homeowners are deceived about the loan's true costs and terms or are pressured into signing loans they cannot afford. Many of these homeowners lose their homes to foreclosure. If you're in the market for a home loan, here are some questions yOll should ask and...»

«SMARTFINDEXPRESS FREQUENTLY ASKED QUESTIONS Welcome to SmartFindExpress (SFE) for Mesa Public Schools. Your service for the district is valuable and appreciated. If you have any questions regarding the information provided, please call the Sub Office. Frequently Asked Questions Page # Accessing SFE 2 Access ID 2 Available Jobs Over the Phone 2 Available Jobs Online 2 Call Times 2 Canceling 3 Declines 3 Directions 3 Do Not Disturb Feature 4 Do Not Use 4 E-Mail Notification 4 Filling Jobs 4 Job...»

«Hoover, Jon (2009) Islamic universalism: Ibn Qayyim alJawziyya's Salafī deliberations on the duration of HellFire. Muslim World, 99 (1). pp. 181-201. ISSN 0027Access from the University of Nottingham repository: http://eprints.nottingham.ac.uk/2385/1/ibn_al_Qayyim.pdf Copyright and reuse: The Nottingham ePrints service makes this work by researchers of the University of Nottingham available open access under the following conditions. This article is made available under the Creative Commons...»

«Current Information The AmericAn School in englAnd 2012 – 2013 CONTENTS miSSion STATemenT AdminiSTrATion FAculTy AcAdemic inFormATion: Lower School (Nursery Grade 5) Middle School (Grades 6 8) Course Offerings Upper School (Grades 9 12) Graduation Requirements Advanced Placement International Baccalaureate Course Offerings engliSh-AS-A-Second lAnguAge ProgrAm SPeciAl needS Policy And leArning reSource ProgrAm counSeling ServiceS exTrA curriculAr And SPorTS/AcTiviTieS TrAvel And excurSionS...»

«These Mountains are our Sacred Places: A Tale of Two Cities Gregory LowanTrudeau Assistant Professor Werklund School of Education University of Calgary, Canada gelowan@ucalgary.ca Submitted to Drs. M. Robertson, R. Lawrence & G. Heath (Eds.) Experiencing the Outdoors: Enhancing Strategies for Wellbeing Sense Publishers May 2014 These Mountains These Mountains are our Sacred Places: A Tale of Two Cities Snow crunches beneath my feet as I reach the top of the...»

«B R E C H T ON ART A N D P O L I T I C S The Piscator Experiment The Piscator Experiment Apart from EngePs crucially important production of Coriolanus, experiments in epic theatre were undertaken in dramatic writing only. (The first drama to construct this epic theatre was Brecht's dramatic biography Baal, the simplest was Emil Burri's American Youthy and the most accomplished so far because it comes from an author with a quite different perspective was Bronnen's East Pole Train.) Now theatre,...»

«Research Brief Multiple Systems Estimation for estimating the number of victims of human trafficking across the world Based on a paper written by Jan van Dijk and Peter G. M. van der Heijden for UNODC1 Sustainable Development Indicator 16.2.2: “Number of victims of human trafficking per 100,000 population, by sex, age group and form of exploitation” (E/CN.3/2016/2/Rev.1) This Research Brief was developed by the UNODC Research and Trend Analysis Branch under the overall coordination of...»

«UNITED STATES SECURITIES AND EXCHANGE COMMISSION Washington, DC 20549 FORM 10-K ANNUAL REPORT PURSUANT TO SECTION 13 OR 15(d) OF THE SECURITIES EXCHANGE ACT OF 1934 For the fiscal year ended October 31, 2015 Commission file no. 0-21964 Shiloh Industries, Inc. (Exact name of Registrant as specified in its charter) Delaware 51-0347683 (State or other jurisdiction (I.R.S. Employer of incorporation or organization) Identification No.) 880 Steel Drive, Valley City, Ohio 44280 (Address of principal...»

«Audit of Certain USAID/Dominican Republic Financial Operations Audit Report No. 1-517-01-002-F January 9, 2001 San Salvador, El Salvador U.S. AGENCY FOR INTERNATIONAL DEVELOPMENT RIG/San Salvador January 9, 2001 MEMORANDUM FOR: USAID/Dominican Republic Director, Elena Brineman FROM: RIG/San Salvador, Timothy E. Cox SUBJECT: Audit of Certain USAID/Dominican Republic Financial Operations (Report No. 1-517-01-002-F) This memorandum is our report on the subject audit. This report contains four...»

<<  HOME   |    CONTACTS
2017 www.thesis.dislib.info - Online materials, documents

Materials of this site are available for review, all rights belong to their respective owners.
If you do not agree with the fact that your material is placed on this site, please, email us, we will within 1-2 business days delete him.