Is there any Justification for this web page?

I love my iPad, it is the perfect tool that I have been waiting for since the invention of the Apple Newton in 1993. Given that so much of what I desire instanous access to is now on my iPad duplicating what I now have on this blog. I now need to seriously consider the value of these Pages to me?

Posted in Uncategorized | Leave a comment

‘You’ve got to find what you love,’ Jobs says

Ripped from the pages of Stanford UNIVERSITY. As story that I have always tried to live

‘You’ve got to find what you love,’ Jobs says

This is a prepared text of the Commencement address delivered by Steve Jobs, CEO of Apple Computer and of Pixar Animation Studios, on June 12, 2005.

I am honored to be with you today at your commencement from one of the finest universities in the world. I never graduated from college. Truth be told, this is the closest I’ve ever gotten to a college graduation. Today I want to tell you three stories from my life. That’s it. No big deal. Just three stories.

The first story is about connecting the dots.

I dropped out of Reed College after the first 6 months, but then stayed around as a drop-in for another 18 months or so before I really quit. So why did I drop out?

It started before I was born. My biological mother was a young, unwed college graduate student, and she decided to put me up for adoption. She felt very strongly that I should be adopted by college graduates, so everything was all set for me to be adopted at birth by a lawyer and his wife. Except that when I popped out they decided at the last minute that they really wanted a girl. So my parents, who were on a waiting list, got a call in the middle of the night asking: “We have an unexpected baby boy; do you want him?” They said: “Of course.” My biological mother later found out that my mother had never graduated from college and that my father had never graduated from high school. She refused to sign the final adoption papers. She only relented a few months later when my parents promised that I would someday go to college.

And 17 years later I did go to college. But I naively chose a college that was almost as expensive as Stanford, and all of my working-class parents’ savings were being spent on my college tuition. After six months, I couldn’t see the value in it. I had no idea what I wanted to do with my life and no idea how college was going to help me figure it out. And here I was spending all of the money my parents had saved their entire life. So I decided to drop out and trust that it would all work out OK. It was pretty scary at the time, but looking back it was one of the best decisions I ever made. The minute I dropped out I could stop taking the required classes that didn’t interest me, and begin dropping in on the ones that looked interesting.

It wasn’t all romantic. I didn’t have a dorm room, so I slept on the floor in friends’ rooms, I returned coke bottles for the 5¢ deposits to buy food with, and I would walk the 7 miles across town every Sunday night to get one good meal a week at the Hare Krishna temple. I loved it. And much of what I stumbled into by following my curiosity and intuition turned out to be priceless later on. Let me give you one example:

Reed College at that time offered perhaps the best calligraphy instruction in the country. Throughout the campus every poster, every label on every drawer, was beautifully hand calligraphed. Because I had dropped out and didn’t have to take the normal classes, I decided to take a calligraphy class to learn how to do this. I learned about serif and san serif typefaces, about varying the amount of space between different letter combinations, about what makes great typography great. It was beautiful, historical, artistically subtle in a way that science can’t capture, and I found it fascinating.

None of this had even a hope of any practical application in my life. But ten years later, when we were designing the first Macintosh computer, it all came back to me. And we designed it all into the Mac. It was the first computer with beautiful typography. If I had never dropped in on that single course in college, the Mac would have never had multiple typefaces or proportionally spaced fonts. And since Windows just copied the Mac, it’s likely that no personal computer would have them. If I had never dropped out, I would have never dropped in on this calligraphy class, and personal computers might not have the wonderful typography that they do. Of course it was impossible to connect the dots looking forward when I was in college. But it was very, very clear looking backwards ten years later.

Again, you can’t connect the dots looking forward; you can only connect them looking backwards. So you have to trust that the dots will somehow connect in your future. You have to trust in something — your gut, destiny, life, karma, whatever. This approach has never let me down, and it has made all the difference in my life.

My second story is about love and loss.

I was lucky — I found what I loved to do early in life. Woz and I started Apple in my parents garage when I was 20. We worked hard, and in 10 years Apple had grown from just the two of us in a garage into a $2 billion company with over 4000 employees. We had just released our finest creation — the Macintosh — a year earlier, and I had just turned 30. And then I got fired. How can you get fired from a company you started? Well, as Apple grew we hired someone who I thought was very talented to run the company with me, and for the first year or so things went well. But then our visions of the future began to diverge and eventually we had a falling out. When we did, our Board of Directors sided with him. So at 30 I was out. And very publicly out. What had been the focus of my entire adult life was gone, and it was devastating.

I really didn’t know what to do for a few months. I felt that I had let the previous generation of entrepreneurs down – that I had dropped the baton as it was being passed to me. I met with David Packard and Bob Noyce and tried to apologize for screwing up so badly. I was a very public failure, and I even thought about running away from the valley. But something slowly began to dawn on me — I still loved what I did. The turn of events at Apple had not changed that one bit. I had been rejected, but I was still in love. And so I decided to start over.

I didn’t see it then, but it turned out that getting fired from Apple was the best thing that could have ever happened to me. The heaviness of being successful was replaced by the lightness of being a beginner again, less sure about everything. It freed me to enter one of the most creative periods of my life.

During the next five years, I started a company named NeXT, another company named Pixar, and fell in love with an amazing woman who would become my wife. Pixar went on to create the worlds first computer animated feature film, Toy Story, and is now the most successful animation studio in the world. In a remarkable turn of events, Apple bought NeXT, I returned to Apple, and the technology we developed at NeXT is at the heart of Apple’s current renaissance. And Laurene and I have a wonderful family together.

I’m pretty sure none of this would have happened if I hadn’t been fired from Apple. It was awful tasting medicine, but I guess the patient needed it. Sometimes life hits you in the head with a brick. Don’t lose faith. I’m convinced that the only thing that kept me going was that I loved what I did. You’ve got to find what you love. And that is as true for your work as it is for your lovers. Your work is going to fill a large part of your life, and the only way to be truly satisfied is to do what you believe is great work. And the only way to do great work is to love what you do. If you haven’t found it yet, keep looking. Don’t settle. As with all matters of the heart, you’ll know when you find it. And, like any great relationship, it just gets better and better as the years roll on. So keep looking until you find it. Don’t settle.

My third story is about death.

When I was 17, I read a quote that went something like: “If you live each day as if it was your last, someday you’ll most certainly be right.” It made an impression on me, and since then, for the past 33 years, I have looked in the mirror every morning and asked myself: “If today were the last day of my life, would I want to do what I am about to do today?” And whenever the answer has been “No” for too many days in a row, I know I need to change something.

Remembering that I’ll be dead soon is the most important tool I’ve ever encountered to help me make the big choices in life. Because almost everything — all external expectations, all pride, all fear of embarrassment or failure – these things just fall away in the face of death, leaving only what is truly important. Remembering that you are going to die is the best way I know to avoid the trap of thinking you have something to lose. You are already naked. There is no reason not to follow your heart.

About a year ago I was diagnosed with cancer. I had a scan at 7:30 in the morning, and it clearly showed a tumor on my pancreas. I didn’t even know what a pancreas was. The doctors told me this was almost certainly a type of cancer that is incurable, and that I should expect to live no longer than three to six months. My doctor advised me to go home and get my affairs in order, which is doctor’s code for prepare to die. It means to try to tell your kids everything you thought you’d have the next 10 years to tell them in just a few months. It means to make sure everything is buttoned up so that it will be as easy as possible for your family. It means to say your goodbyes.

I lived with that diagnosis all day. Later that evening I had a biopsy, where they stuck an endoscope down my throat, through my stomach and into my intestines, put a needle into my pancreas and got a few cells from the tumor. I was sedated, but my wife, who was there, told me that when they viewed the cells under a microscope the doctors started crying because it turned out to be a very rare form of pancreatic cancer that is curable with surgery. I had the surgery and I’m fine now.

This was the closest I’ve been to facing death, and I hope it’s the closest I get for a few more decades. Having lived through it, I can now say this to you with a bit more certainty than when death was a useful but purely intellectual concept:

No one wants to die. Even people who want to go to heaven don’t want to die to get there. And yet death is the destination we all share. No one has ever escaped it. And that is as it should be, because Death is very likely the single best invention of Life. It is Life’s change agent. It clears out the old to make way for the new. Right now the new is you, but someday not too long from now, you will gradually become the old and be cleared away. Sorry to be so dramatic, but it is quite true.

Your time is limited, so don’t waste it living someone else’s life. Don’t be trapped by dogma — which is living with the results of other people’s thinking. Don’t let the noise of others’ opinions drown out your own inner voice. And most important, have the courage to follow your heart and intuition. They somehow already know what you truly want to become. Everything else is secondary.

When I was young, there was an amazing publication called The Whole Earth Catalog, which was one of the bibles of my generation. It was created by a fellow named Stewart Brand not far from here in Menlo Park, and he brought it to life with his poetic touch. This was in the late 1960′s, before personal computers and desktop publishing, so it was all made with typewriters, scissors, and polaroid cameras. It was sort of like Google in paperback form, 35 years before Google came along: it was idealistic, and overflowing with neat tools and great notions.

Stewart and his team put out several issues of The Whole Earth Catalog, and then when it had run its course, they put out a final issue. It was the mid-1970s, and I was your age. On the back cover of their final issue was a photograph of an early morning country road, the kind you might find yourself hitchhiking on if you were so adventurous. Beneath it were the words: “Stay Hungry. Stay Foolish.” It was their farewell message as they signed off. Stay Hungry. Stay Foolish. And I have always wished that for myself. And now, as you graduate to begin anew, I wish that for you.

Stay Hungry. Stay Foolish.

Thank you all very much.

Posted in Business, Uncategorized | Leave a comment

How do they figure the distance between celestial bodies?

How do they figure the distance between celestial bodies?

November 7, 2000

Dear Straight Dope:

I’ve been wondering–what is the process that we use to measure the distance of objects in space? How do we really know that a planet is 1,200 light years away?

— Mike, Rochester, NY

You probably think there’s a simple answer to this question, Mike. What’s frightening is that this is it. But we figure you’re old enough to take it.

There are a number of steps in the process, with the results of each step used to calibrate the next. To start with, we need to know the distances of things in the solar system. For this, we use something called Kepler’s Third Law. This states that for any object in the solar system, the orbital period P (in years) is related to the average radius of the orbit A by the formula P2 = A3. The period can be determined easily by going out at night and watching the planet or whatever move. Plugging in the P gives us the radius A in astronomical units, or AUs. An AU is the average distance from the earth to the sun. To figure the length of an AU, we need to measure at least one distance. Usually, this is done by sending a radar pulse to either Mars or Venus, when it’s at closest approach to the Earth. Since we now know the difference |AMars or Venus – AEarth|, and we already knew the ratio PMars or Venus/PEarth, the rest is child’s play.

OK, so now we’ve got the solar system licked. What about other stars? They’re too far away for radar to be any use. What we use here is a method called parallax (sometimes called trigonometric parallax). To see parallax on a small scale, hold one finger up at arm’s length in front of you, and look at it first with your left eye, and then with your right eye. Your finger will appear to change position relative to the background. Exactly how much depends on the distance b between your eyes (referred to as the baseline), and the distance D of your finger from your eyes. Specifically, for large distances, the formula is D = 2b/theta, where theta is the angle by which your finger shifts relative to the background, measured in radians (to get from degrees to radians, divide by 57.296). To use this method on the stars, we want the longest baseline we can easily get, which is the diameter of the Earth’s orbit. If you take a picture of an area of the sky one night, and then take another picture of the same area six months later, the nearby stars in that area of the sky will shift their positions slightly relative to the other stars. Parallaxes can be measured in this way for stars nearer than a few hundred lightyears; after that, the shift is too small to detect.

To get the distance of anything farther away than that, we usually use some sort of “standard candle” method. The apparent brightness of an object depends both on how bright it actually is, and on how far away it is. For example, if a hundred watt light bulb is 10 meters away, it’ll look exactly as bright as a 25-watt light bulb 5 meters away. A “standard candle” is an object of known brightness. If we then measure its apparent brightness, we can determine how far away it is. This can be used to determine any distance, as long as you can still see the object you’re using. For instance, we can use the distances from parallax to determine the average brightness of various colors of normal or “main sequence” stars, as they’re called. This gives us a standard candle good out to the various star clusters in our galaxy, a hundred thousand lightyears or so away.

Then, in some clusters, we see stars called Cepheid variables, whose brightness changes in a certain way with time. Their average brightness depends on their period, and if we see one in a cluster a known distance away, we can determine its brightness and thus determine exactly how the brightness depends on period, so we’ve got another standard candle. Since Cepheids are brighter than main sequence stars, we can use them at greater distances, allowing us to determine the distances of Cepheids in nearby galaxies, and thus the distances to those galaxies. This in turn allows us to determine the brightness of yet brighter standard candles, and so on. Currently, the largest distances are obtained from a certain type of exploding star called a supernova 1a. These are nearly ten billion times as bright as our sun, and can be seen nearly as far away as the edge of the visible universe. Oughta hold us for now.

— Chronos

Posted in Physics | Leave a comment

NCR NEAT 399 or simply NEAT 3

The NCR 499

Although not the first programming language, and by far not my favorite, that I taugh myself, NEAT 3 on a NCR 499 computer was my first professional consulting contract about the spring, summer period of 1979.

The Bergen News, as it was called back then, had placed an ad in their paper for a computer programmer.

In those days people were more openminded and likely to give you a chance, if you were bold enough. You could literal walk into the business, without any experience, meet with the owner asking them to give you a job if you could prove that you could do the job; in this case learn how to develope software for the NCR 499 system using NEAT 3, within a two week period.

He was willing to take the bet on me since I was willing to charge a relatively small sum compare to the NCR 499 consultants he was finding. So he gave me all the manauals for NEAT 3 and the NCR 499 system which I took home and within a week was ready to show I knew what I was talking about.

At the time NCR sales people were not happy to be told by the then owner of the Bergen News thier services were not longer needed and the work was being done by an upstart at rockbottom prices (a story later relayed to several years later by the owner, as the owner started to upgrad to IBM PC’s). I didn’t think I was charging a rockbottom price for my skills. I was just happy someone had given me a chance to professionally prove myself.

This photographs brings back grand old memories of the days developing my first set of professional software systems.

NEAT 3 as ripped from CORE MEMORY pages

NCR NEAT 399

NCR 399 Programming Language

Internal operations and certain external functions of the NCR 399 system are programmed in the 399 mnemonic language, using near-English macro instructions designed specifically for accounting applications. The programmer must supply two types of information to the application program — data descriptions and procedural instructions. Data layout and coding worksheets are provided for the completion of this information.

The data layout worksheet is used to define the format of Data Layout Worksheetthe variables on which all arithmetic and data handling operations are to be performed. This data is stored in memory in the form of numeric or alphanumeric values provided as a part of the application program or as a result of previous operations. Each data field defined by the programmer will have a header associated with it in memory. This header contains information reflecting the format of the data field as specified by the programmer on the data layout worksheet. In addition to field length, data type, decimal position, and sign, such information as print position and paper advance code can be defined to achieve automatic print control.

If the data being defined is to be printed on the serial or line printer, the programmer can specify the column number where printing is to start and the number of lines to advance the paper after printing that data lield Then, when the data field is output to the printer, it will be printed automatically in the specified column and the paper advanced if necessary. This feature is especially advantageous when formatting and printing reports; it saves memory by eliminating unnecessary coding within the program.

The NCR 399 programmer need never be concerned with keeping track of actual data addresses in memory. Meaningful symbolic names are assigned to data fields in the program. Once the programmer assigns these names, he thereafter refers to data by name only; software automatically handles the hardware addressing.

Procedural instructions are entered on the 399 coding worksheet.Coding Worksheet These instructions: (1) define the functions that must be performed to achieve the end results of the application, and (2) specify the application data that is to be operated on by the interpreter. Instructions, which are variable in length, may contain up 10 three operand addresses.

The 399 language is made up of a comprehensive set of macro-instructions that control forms movement and printing, manipulate data, perform arithmetic functions, process tables, and transfer information to and from the I/O devices. Each macro is a software instruction that, when executed by the interpreter, initiates a series of hardware functions necessary to accomplish the specified task. For instance, when the programmer codes a PUT Serial Printer instruction, the interpreter initiates execution of all those internal functions required to print on a document inserted in the forms handler.

Functions that previously had to be performed manually by the operator now are accomplished under program control, resulting in faster, more efficient operation of the system with less chance of human error. These functions include such operations as positioning the serial printer for typing, rewinding the cassette tape, positioning forms to a specified printline, opening and closing the platen, and ejecting documents.

Simple Input/Output (I/O) instructions handle the normally complex operation of transferring data to and from peripheral units. For instance, to input data from an I/O device such as the card reader, the programmer need only code a GET Card Reader instruction. Issuance of the instruction causes data to be transferred from the card reader to an area in memory defined by the programmer. The instruction performs all necessary translation (from Hollerith card code to ASCII characters) and data formatting. Likewise, to punch a card, the programmer issues a PUT Card Punch instruction, which transfers data from memory to the card punch peripheral. Certain peripherals, such as the line printer, also provide control instructions that perform such functions as paper advance and print positioning.

Because each macro can initiate the performance of many functions, program length and complexity, and programming effort are reduced significantly. The 3-address structure of the language is another powerful feature that minimizes program length. For example, with just one instruction, a programmer can direct the 399 to add the contents of one address to the contents of another address and store the result in a third address.

Variations of the add instruction allow the programmer to:

(1) Add one series of individual consecutive fields to a second series of individual consecutive fields.

Add: A1 to B1 A2 to B2 A3 to B3

(2) Add a series of individual consecutive numeric fields to a single field.

Add: A1 to B A2 to B A3 to B

(3) Add a single numeric field to a series of consecutive fields.

Add: A to B1 A to B2 A to B3

Up to 31 consecutive fields can be affected by one add instruction. The subtract, multiply, divide, and move instructions operate in much the same manner.

The NCR 399 processor controlled both internal operations such as arithmetic, decision making, and data manipulation, and external functions suASCII Codech as typing and forms alignment and movement. Normally, a processor such as that used in the 399 must be programmed at the hardware level, which means that the user either must write lengthy, complex programs or sacrifice a certain degree of flexibility by relying entirely on packaged software provided by the manufacturer. Recognizing the need to ease the user’s programming burden while still providing him with the ability to tailor the system to his own requirements, NCR decided to take an interpretive approach to programming. This approach permits the user to interface with the hardware through the medium of a simple, problem-solving programming language, enabling him to concentrate on the requirements of his system rather than on the intricacies of the equipment.

NCR 399 software was designed specifically to solve accounting problems. It simplifies the programming of even the most difficult applications — payroll, accounts receivable/accounts payable, inventory, financial reporting, utility billing, and so on. The 399 programming language provides clear, concise, near-English macro instructions that describe the steps necessary to accomplish a specified task. These instructions are entered on special 399 programming worksheets and become the application program. As the 399 processor does not recognize near-English instructions, they must be translated into a format that can be interpreted for execution by the processor. This format consists of series of bits; translation to that format is performed by an assembler program.

Program assembly

When an error-free program has been assembled or reassembled on cassette, it is read into memory, and an interpreter is tailored to that particular application by a link loading process. All communication between the user program and the 399 processor is then controlled by the interpreter.

Program execution

             

Assembling 399 Programs

Each line coded on the programming worksheets is a source line; all the source lines for a given application constitute the source program. However, the source program is not executable by the processor in this form. It must be converted to a form compatible with the interpreter for execution by the 399 processor. This conversion is performed by the 399 assembler program, which is contained on the Master Software Cassette supplied with every 399 system.

NCR 399 application programs can be assembled directly on theAssembly Specifications for NCR 399 399 unit or, optionally, on an NCR Century system. The assembly process is controlled through entries on the appropriate assembler specification worksheet.

These parameter sheets are used to supply the assembler with information about the application program and the 399 system on which it will be run, and to request that certain optional functions be performed during the assembly process itself. The worksheets become a part of the source program for input to the assembler. The source program may be punched into cards or it may be input to the assembler through the 399 alphanumeric keyboard directly from the worksheets. The operator is guided through the assembly process by the lights on the status indicator panel, which are used to request and verify his responses, request cassette changes, and indicate errors.

The 399 assembler operates in two phases. The first is a preassembly phase, which validates certain portions of the source input and writes it to cassette in preparation for the next phase. The resultant programAssembly Specifications for NCR Century on cassette is called the reassembly master. Additionally, this phase calculates memory addresses for use with the associated symbolic reference names assigned by the programmer and builds a table that specifies which interpretive subroutines are to be included in the program.

The second phase reads one source line (one 80-character record) at a time, condenses it into a series of bits that indicate to the interpreter which routines to execute and on what data fields to operate, and records the assembled program on a cassette. A source program listing is printed on the serial printer as each statement is processed. Any errors encountered during assembly are printed as a notation preceding the erroneous source line.

Reassembly permits corrections, additions, or deletions to a previously assembled program. For instance, a program can be reassembled to correct errors detected during an initial assembly or to include additional logic when peripherals or memory are added to a system. Input to the assembler during the reassembly process includes the reassembly master (the cassette tape output during the first phase of assembly) and only those source lines affected by changes, eliminating the need to enter the entire program a second time. The program is modified by matching source lines, inserting new ones, and deleting old ones where necessary.

Programs can be assembled directly on a 399 system that includes two cassette handlers. If a system has only one cassette handler, NCR will provide, on a temporary basis, a second, plug-in cassette handler to permit program assembly. An NCR Century assembler is also available for assembling 399 programs.

Program Execution

The purpose of the assembly process is to convert an applicationApplication Program program into a format that can be used by the 399 interpreter for execution of the program. The assembled program contained on cassette is referred to as an unexecutable object program because it does not yet include the interpreter. To create an executable program, an interpreter must be tailored to that particular application. This is accomplished through a series of software programs called Linking Loader, which were written to the object program cassette during assembly. First, Linking Loader calls into memory the unexecutable object program from its cassette. Next, from the Master Software Cassette it calls into memory the basic interpreter and only those optional interpretives required by the program. The result is a memory-resident executable program.

The actual amount of memory allocated to the interpreter, user program, and data areas depends on application requirements. Once an executable program is built in memory, it can be processed and/or copied to cassette.

The interpreter is the software interface between the 399 language and the minicomputer and its peripherals. It controls all I/O devices and the sequence of program operations, ensuring that the proper routines are executed as necessary. The interpreter can be divided logically into three areas, or subsystems — executive control, internal operations, and I/O operations.

Automatic sequencing of operations within the 399 is performed by executive control, which uses instructions from the application program to perform the setup operations required by internal and I/O related subsystems, to control program sequencing, and to request operator input as required. In addition, the executive provides common routines that may be used by the operating subsystems, determines which operations program is to perform the application instruction, and transfers control accordingly.

The internal operations subsystem includes all those routines associated with arithmetics, data handling, and data formatting, as well as other standard routines designed to minimize the memory requirements of the interpreter.

The I/O operations subsystem performs all operations directly associated with peripheral devices and is responsible for hardware-related error detection and correction.

The interpreter’s normal sequence of operations begins with the executive, which reads and decodes an application instruction. Using the assembled instruction (the series of bits), it searches a table for the address of the designated macro routine. When the instruction is set up, control is transferred to the appropriate routine in the internal operations subsystem. I/O related routines are accessed by the internal operations subsystem as they are needed. Upon completion of all necessary operations, control is returned to the executive, which then reads and decodes the next program instruction.

Posted in Computer Science, Uncategorized | Leave a comment

The Top 10 Reasons Leaders Fail

Ripped from the pages of CNBC.COM is something worth thinking about; Where does a person honestly rank themselves.

The Moral Failure of Leaders: Business, Sports and Politics by Jack Stark, PhD, author of “The Championship Formula: How to Transform Your Team Into a Dynasty.”

Everyone has flaws. They key to success and to being a great leader is to not have too many flaws and especially to not have a fatal flaw.

What is a fatal flaw?

Think Madoff, Enron, Dominique Strauss-Kahn, John Edwards, those caught up in the Penn State sex scandal and many others.

The Cause

The cause of the “fatal flaws” is twofold – one moral and the other psychological.

I have always had a hard time figuring out why these seemingly bright, talented, well-educated leaders could fail so spectacularly and often so abruptly.

Then I remembered my training on moral development.

Leaders often fail when they don’t pass through the six stages of moral development during their formative childhood years. People can be “stuck” in the highest stage they attain and that stage will govern their moral behavior unless there is a significant intervention and treatment process.

Stages of Moral Development

  • Stage 1 Avoiding punishment (the most basic stage of moral development often found in young children).
  • Stage 2 Serving self-interest (a dangerous stage to be stuck in morally, where the focus is “What’s in it for me?” and moral decisions are based on “What can I get out of it?”)
  • Stage 3 Seeking approval from others (the stage usually occurring on one’s teens in which one behaves morally to live up to the expectations of others).
  • Stage 4 Following authority (someone in this stage does as told and follows the rules).
  • Stage 5 Respect for social order (in this stage of moral development, one’s behavior is driven by the desire to maintain social order and avoid chaos in society).
  • Stage 6 Universal ethical principles (in this highest stage of moral development one is guided by conscience according to universal moral principles).

The followers of leaders—be they employees, colleagues, fans, members of a religious group, or the rank and file of a military branch—want and need their leaders and heroes and to have character infused with the highest level of moral development. We want to follow and believe in someone whose personality traits we admire. It’s the glue that keeps us united.

“The cause of the “fatal flaws” is twofold – one moral and the other psychological.”

Jack Stark, PhD
Author, “The Championship Formula”

 

The ability to handle high-pressure, challenging tasks can be addictive. For some high-profile leaders, that level of pressure, in combination with opportunities to behave unethically, has led to their failure. We can learn from their unethical behavior.

My experience (thousands of interviews, interventions and therapy) allowed me to have insights into deceitful, unethical leaders and uncover the top 10 reasons why leaders fail—in my opinion based on my experience. The reasons are not in any priority ranking and often multiple reasons contribute to a leader’s demise.

Top 10 Reasons Leaders Fail

  1. Greed. I’m going to get mine since everyone else is and besides, I deserve it. Look at what I have done.
  2. Insecurity. Poor self-esteem based on family experiences – shockingly high.
  3. Power. I am in control – I want and get the attention I need.
  4. Arrogance. Delusion belief: I am better than anyone else.
  5. Narcissism. Severe form of selfishness and often an inability to love others.
  6. Paranoia. Never trust anyone – no such thing as loyalty.
  7. Manic Behavior. Obsessively driven which often results in a big crash.
  8. Addictions. Drugs, alcohol, gambling and sexual compulsions.
  9. Burnout and Depression. Often hidden and at least subconsciously reasons for irrational behavior.
  10. Moral Deficiencies. Primitive moral development and rationalization and blaming.
Posted in Education, Uncategorized | Leave a comment

Unix/Linux Administration Logical Volume Management Guide

Ripped from the pages http://content.hccfl.edu/pollock/AUnix1/LVM.htm

 

Unix/Linux Administration
Logical Volume Management Guide

©2005 Wayne Pollock, Tampa Florida USA.
All Rights Reserved.

Adapted from the LVM How-to[external link] page
from The Linux Documentation Project[external link] website.

 

LVM (Logical Volume Management) Overview:

In the olden days computer disks were small compared to the data set sizes that were needed.  A solution is to make several physical disks appear virtually (or logically) as a single much larger disk.  A large filesystem could then be created on that virtual disk.

Later hard disk sizes grew large enough that it made sense to do the opposite: make one physical disk appear as several virtual disks.  Each virtual disk holds a filesystem independently of the others.  Today such virtual disks are called disk “partitions“.

Now we have come full circle.  Large data warehouse applications require very large filesystems to hold the database data files.  To support this sort of application the old idea of combining several disks into one has been resurrected.  Novell Netware supported this feature since the 1990s.  The physical disk (or selected disk partitions) are formatted to be “physical volume segments“.  All added physical volume segments become part of a single large virtual disk.  The administrator can then create logical “volumes“, that is, a filesystem.  The exciting part is that if some volume is low on space, you can extend the virtual disk by adding another physical volume segment to it, and then increase the (logical) volume’s size.  This operation is fast and doesn’t disturb the existing data or other partitions (or volumes)!

The modern Unix (and Linux) version of this idea is called “Logical Volume Management” (or “LVM”).  LVM allows the administrator to

  • use and allocate disk space more efficiently and flexibly
  • move logical volumes between different physical devices
  • have very large logical volumes span a number of physical devices
  • take snapshots of whole filesystems easily, allowing on-line backup of those filesystems
  • replace on-line drives without interrupting services

Linux also supports software RAID, which, like LVM, can be used to provide disk striping.  The two systems are independent of each other.  So you can use RAID to provide striping and use that RAID volume as a physical volume for LVM.  There is no reason to use both software RAID and LVM, although it can be done.  However it does make good sense to use hardware RAID and LVM together.

As a quick example of what can be done, suppose you need to grow some filesystem by 500 MB.  Without LVM, you must either have a spare 500 MB on the same physical disk or the entire filesystem must be moved to another (larger) disk.  Even if you have the space, all the other partitions (and their filesystems) must be moved, so the 500 MB is at the end of the partition you wish to grow.  Then all the data must be backed up, the partition changed, and the old data restored.  While today there are tools such as parted that can manage this for regular partitions (not LVs), using LVM you can quickly and safely grow a filesystem in this simple way:

  root# umount /dev/vg1/lv1
  root# e2fsadm /dev/vg1/lv1 -L+500M
  root# mount /dev/vg1/lv1

The free space can be anywhere, even on another disk!

LVM for Linux has had two major versions (so far).  Both have many more features and controls than the older Novell scheme.  While the additional controls and features aren’t always needed, you still must understand them in order to setup and manage LVM.  LVM is also available in recent Solaris and other Unix systems, however different terms may be used for the same concepts.  The Veritas Volume Manager, (“VVM” or “VxVM”) is a proprietary but popular logical volume manager from Veritas (now part of Symantec).  It is available for Windows, AIX, Solaris, Linux, and HP-UX (where it is the system default LVM).  Also take a look at the Enterprise Volume Management System (a sourceforge.net project).

 

Basic Concepts and Terminology:

[relationship of PVs, VGs, and LVs] 

Figure 1

With LVM, physical volume segments are simply called “physical volumes” (or “PVs“).  These PVs are usually entire disks but may be disk partitions.  The PVs in turn are combined to create one or more large virtual disks called “volume groups” (or “VGs“).  While you can create many VGs, one may be sufficient.  A VG can grow or shrink by adding or removing PVs from it.  VGs appear to be block devices, similar to other disks such as /dev/hda.  In fact each VG can be referred to by the name “/dev/VG_name“.

Once you have one or more volume groups you can create one or more virtual partitions called “logical volumes” (or “LVs“).  Note each LV must fit entirely within a single VG.  LVs appear to be block devices similar to disk partitions such as /dev/hda1, with entries named “/dev/VG_name/LV_name“.  LVs have a number of parameters that can be set (and later most can be changed) that can affect disk I/O performance, including extent size, chunk size, stripe size and stripe set size, and read-ahead.  These are discussed below.

Term Meaning
PE Physical Extent
LE Logical Extent
PV Physical Volume
VG Volume Group
LV Logical Volume

Table 1: LVM Acronyms

Finally, you can create any type of filesystem you wish on the logical volume, including as swap space.  Note that some filesystems are more useful with LVM than others.  For example not all filesystems support growing and shrinking.  ext2, ext3, xfs, and reiserfs do support such operations and would be good choices.  The relationship between PVs, VGs, and LVs is illustrated in figure 1.  Table 1 summarizes these acronyms.

 

Extents:

When creating a volume group from one or more physical volumes, you must specify the size of the “extents” of each of the physical volumes that make up the VG.  Each extent is a single contiguous chunk of disk space, typically 4M in size, but can range from 8K to 16G in powers of 2 only.  (Extents are analogous to disk blocks or clusters.)  The significance of this is that the size of logical volumes are specified as a number of extents.  Logical volumes can thus grow and shrink in increments of the extent size.  A volume group’s extent size cannot be changed after it is set.

The system internally numbers the extents for both logical and physical volumes.  These are called logical extents (or LEs) and physical extents (or PEs), respectively.  When a logical volume is created a mapping is defined between logical extents (which are logically numbered sequentially starting at zero) and physical extents (which are also numbered sequentially).

To provide acceptable performance the extent size must be a multiple of the actual disk cluster size (i.e., the size of the smallest chunk of data that can be accessed in a single disk I/O operation).  In addition some applications (such as Oracle database) have performance that is very sensitive to the extent size.  So setting this correctly also depends on what the storage will be used for, and is considered part of the system administrator’s job of tuning the system.

 

Booting Consideration with LVM:

LVM can be use with RAID.  LVM can be used to hold all filesystems.  However special considerations apply when using LVs for the boot and root filesystems.  This is because the BIOS code in the ROM of the motherboard of your computer must be able to locate and load the kernel.  So if the boot partition was a LV, the BIOS would need to know about PVs, VGs, and LVs, and it probably doesn’t.  Unless you are using some custom BIOS you must not make the bootable partition an LV.

The root partition (if it isn’t also the boot partition) may be a logical volume.  However this means the kernel must access the root partition before it can load any (e.g., LVM) kernel modules.  Thus the modules for LVM must be compiled into the kernel.  This is rarely the case with standard distributions!  (There is a similar issue with SCSI drivers, as most kernels only compile in the IDE drivers.)  For this reason, as well as allowing a filesystem to be accessed by another operating system (yes there are ext2 drivers available for Windows), some system administrators prefer to make the root filesystem on a regular partition rather than on a logical volume.  Note in this case you can make a single root+boot partition.

The solution to using a logical volume for your root filesystem (as it is with SCSI) is either to build a custom kernel with the correct drivers compiled in, or to make sure the system loads a RAM disk initially, known as initrd, which contains all the correct modules.  This RAM disk then loads the system as normal, and goes away.  Creating a ramdisk on Linux is simple using the mkinitrd script.  Just run this command (as root). You need to know the kernel version, and then you must update grub.conf to use the ramdisk:

/root# KERNEL_VERSION=`uname -r`
/root# mkinitrd -v initrd.$KERNEL_VERSION $KERNEL_VERSION
/root# mv initrd.$KERNEL_VERSION /boot
/root# vi /boot/grub/grub.conf  # or /boot/grub/menu.lst

Before logical volumes can be mounted, the LVM driver must be loaded (or compiled in) to the kernel.  Next all physical volumes on all available drives must be found and examined, in order to determine all the volume groups.  Finally the volume groups must be activated, which causes the kernel to recognize the various block devices.  Only then can the filesystems within logical volumes be mounted. So most systems add code similar to the following to the boot up scripts (typically the rc.sysinit script):

# LVM2 initialization
if [ -x /sbin/lvm.static ]  # Check for LVM v2
then
   # make sure device mapper (LVM2) kernel module is loaded:
   if ! grep -q "device-mapper" /proc/devices 2>/dev/null
   then
      modprobe dm-mod >/dev/null 2>&1
   fi
   # Cleanup and then recreate device mapper control file:
   /bin/rm -f /dev/mapper/control
   echo "mkdmnod" | /sbin/nash --quiet >/dev/null 2>&1
   if [ -c /dev/mapper/control ]  # if LVM2 is loaded:
   then  # Check for any physical volumes:
      if /sbin/lvm.static vgscan > /dev/null 2>&1
      then
         echo "Setting up Logical Volume Management:"
         # Activate volume groups and re-create all /dev entries:
         /sbin/lvm.static vgchange -a y && /sbin/lvm vgmknodes
      fi
   fi
fi

You may want to edit the file /etc/init.d/halt to deactivate the volume groups at shutdown.  However this shouldn’t be necessary when using LVM version 2.  To deactivate volume groups, insert the following near the end of this file (just after the filesystems are mounted read-only and before the comment that says “Now halt or reboot“):

# Deactivate LVM:
if [ -x /sbin/lvm.static ]  # Check for LVM v2
then
   echo "Deactivating LVM volume groups:"
   /sbin/lvm.static vgchange -a n
fi

Like all storage devices data may become corrupted over time.  LVM provides a command “vgck” you can use to periodically check the consistency of your volume groups.  It may pay to add this command to the bootup scripts.

 

Linear and Striped Mapping:

Let’s suppose we have a volume group called VG1, and this volume group has a physical extent size of 4M.  Suppose too this volume group is composed of one disk partition /dev/hda1 and one whole disk /dev/hdb.  These will become physical volumes PV1 and PV2 (more meaningful names for a particular scenario can be given if desired).

The PVs are different sizes and we get 99 (4M) extents in PV1 and 248 extents in PV2, for a total of 347 extents in VG1.  Now any number of LVs of any size can be created from the VG, as long as the total number of extents of all LVs sums to no more than 347.  To make the LVs appear the same as regular disk partitions to the filesystem software, the logical extents are numbered sequentially within the LV.  However some of these LEs may be stored in the PEs on PV1 and others on PV2.  For instance LE[1] of some LV in VG1 could map onto PE[51] of PV1, and thus data written to the first 4M of the LV is in fact written to the 51st extent of PV1.

When creating LVs an administrator can choose between two general strategies for mapping logical extents onto physical extents:

  1. Linear mapping will assign a range of PE’s to an area of an LV in order (e.g., LE 1–99 map to PV1‘s PEs, and LE 100–347 map onto PV2‘s PEs).
  2. Striped mapping will interleave the disk blocks of the logical extents across a number of physical volumes.  You can decide the number of PVs to stripe across (the stripe set size), as well as the size of each stripe.

When using striped mapping, all PVs in the same stripe set need to be the same size.  So in our example the LV can be no more than 198 (99 + 99) extents in size.  The remaining extents in PV2 can be used for some other LVs, using linear mapping.

The size of the stripes is independent of the extent size, but must be a power of 2 between 4K and 512K.  (This value n is specified as a power of 2 in this formula: (2^n) × 1024 bytes, where 2 ≤ n ≤ 9.)  The stripe size should also be a multiple of the disk sector size, and finally the extent size should be a multiple of this stripe size.  If you don’t do this, you will end up with fragmented extents (as the last bit of space in the extent will be unusable).

Tables 2 and 3 below illustrate the differences between linear and striped mapping.  Suppose you use a stripe size of 4K, an extent size of 12K, and a stripe set of 3 PVs (PVa, PVb, and PVc), each of which is 100 extents.  Then the mapping for an LV ( whose extents we’ll call LV1, LV2, …) to PVs (whose extents we’ll call PVa1, PVa2, …, PVb1, PVb2, …, PVc1, PVc2, …) might look something like the following.  (In this table the notation means volume_name extent_number . stripe_number):

Example of Linear Mapping
Logical Extents Physical Extents
LV1 PVa1
LV2 PVa2
LV3 PVa3
LV4 PVa4
LV99 PVa99
LV100 PVb1
LV101 PVb2
LV199 PVb99
LV200 PVc1
LV201 PVc2
Example of Striped Mapping
Logical Extents Physical Extents
LV1.1 PVa1.1
LV1.2 PVb1.1
LV1.3 PVc1.1
LV2.1 PVa1.2
LV2.2 PVb1.2
LV2.3 PVc1.2
LV3.1 PVa1.3
LV3.2 PVa1.3
LV3.3 PVa1.3
LV4.1 PVa2.1
LV4.2 PVb2.1
LV4.3 PVc2.1

Tables 2 and 3: Linear versus Striped Mapping

In certain situations striping can improve the performance of the logical volume but it can be complex to manage.  However note that striped mapping is useless and will in fact hurt performance, unless the PVs used in the stripe set are from different disks, preferably using different controllers.

(In version 1 of LVM LVs created using striping cannot be extended past the PVs on which they were originally created.  In the current version (LVM 2) striped LVs can be extended by concatenating another set of devices onto the end of the first set.  However this could lead to a situation where (for example) a single LV ends up as a 2 stripe set, concatenated with a linear (non-striped) set, and further concatenated with a 4 stripe set!

 

Snapshots:

A wonderful facility provided by LVM is a snapshot.  This allows an administrator to create a new logical volume which is an exact copy of an existing logical volume (called the original), frozen at some point in time.  This copy is read-only.  Typically this would be used when (for instance) a backup needs to be performed on the logical volume but you don’t want to halt a live system that is changing the data.  When done with the snapshot the system administrator can just unmount it and then remove it.  This facility does require that the snapshot be made at a time when the data on the logical volume is in a consistent state, but the time the original LV must be off-line is much less than a normal backup would take to complete.

In addition the copy typically only needs about 20% or less of the disk space of the original.  Essentially, when the snapshot is made nothing is copied.  However as the original changes, the updated disk blocks are first copied to the snapshot disk area before being written with the changes.  The more changes are made to the original, the more disk space the snapshot will need.

When creating logical volumes to be used for snapshots, you must specify the chunk size.  This is the size of the data block copied from the original to the snapshot volume.  For good performance this should be set to the size of the data blocks written by the applications using the original volume.  While this chunk size is independent of both the extent size and the stripe size (if striping is used), it is likely that the disk block (or cluster or page) size, the stripe size, and the chunk size should all be the same.  Note the chunk size must be a power of 2 (like the stripe size), between 4K and 1M.  (The extent size should be a multiple of this size.)

You should remove snapshot volumes as soon as you are finished with them, because they take a copy of all data written to the original volume and this can hurt performance.  In addition, if the snapshot volume fills up errors will occur.

 

LVM Administration — Commands and Procedures:

The lvm command permits the administrator to perform all LVM operations using this one interactive command, which includes built-in help and will remember command line arguments used from previous commands for the current command.  However each LVM command is also available as a stand-alone command (that can be scripted).  These are discussed briefly below, organized by task.  See the man page for the commands (or use the built-in help of lvm) for complete details.

 

Plan the Disk Layout:

Disk I/O is often the determining factor in overall system performance.  If your system has multiple disks and controllers, the correct strategy is to have them all used in parallel (that is, simultaneously) as much of the time as possible.  In addition you should aim to place files in such a way as to minimize disk head movement (and thus minimize seek time).

While the performance improvements are real they often aren’t significant.  Whether or not to worry about these issues depends upon the current performance, file size, type, and access patterns, and which applications are running.  Some ways to maximize performance are:

  • To support a large number of users, say on a development system, you can use linear mapping of your logical volumes to a large number of physical volumes on different disks, hopefully using different controllers as well.
  • On a production system running several different services, the files for each service should be placed on different physical disks (so for example the web server I/O won’t interfere with the FTP server I/O).
  • On a database server the files for the following should be placed on different disks: tables, their indexes, log files, and frequently used tables (that don’t get cached in RAM).
  • If possible it pays to keep the operating system files (i.e., boot, root, /var/log, swap) on one disk and the rest of the system on another, so the system’s disk I/O doesn’t interfere with the service’s I/O.
  • Large files are often accessed sequentially.  Such access is most efficient when the data blocks of the file are contiguous, at the outer edge of the disk (cylinder zero).  In addition striping across different disks is very helpful in minimizing disk head movement and allowing parallel read/write operations.
  • Large sequentially accessed files, such as those on a web or FTP server can also benefit from a LVM feature called read ahead.  This allows the next block to be read from disk to RAM at the same time as the current block, useful when the data is contiguous.  (Non-LVM disk read-ahead is also settable in Linux, by changing some parameters in /proc/sys/vm.)
  • On the other hand, randomly accessed files benefit from a more central placement on a disk, as this will tend to minimize disk seek time.
  • Using striped mapping can speed up access to large, sequentially accessed files, as more disks (and controllers) are used simultaneously.  Smaller, randomly accessed files don’t benefit much from striping.
  • To avoid striping performance problems you should format one PV per whole disk.  LVM can’t tell if two PVs are on the same physical disk or not, so if you create a multiple PVs per disk and then create striped LVs, the stripes could be on different partitions on the same disk.  This would result in a decrease in performance rather than an increase.  If you do format a disk with multiple PVs, make sure no two of them are added to the same volume group if striped mapping will be used!
  • For a production system it is recommended that you create one PV per whole disk, for Administrative convenience.  It’s easier to keep track of the hardware in a system if each real disk only appears once.  This becomes particularly true if a disk fails.

Other factors affect the disk I/O performance.  Often these other factors overshadow any performance gains from careful disk layout.  One example is the disk scheduler in the kernel.  The 2.6 version of the Linux kernel comes with two different schedulers you can select between.  In addition the kernel supports features such as read-ahead independently of any LVM settings.

Another issue is the disk controller settings.  Sometimes the disk schedules disk I/O regardless of the kernel.  Disks also have configurable settings, including DMA, buffering, etc.  These can be changed with “hdparam” and other utilities.

Other factors that affect performance include bus type and speed, and what other devices are attached to that bus.

 

Format Physical Volumes (PVs)

To initialize a disk or disk partition as a physical volume you just run the “pvcreate” command on the whole disk.  For example:

  pvcreate /dev/hdb

This creates a volume group descriptor at the start of the second IDE disk.  You can initialize several disks and/or partitions at once.  Just list all the disks and partitions on the command line you wish to format as PVs.

Sometimes this procedure may not work correctly, depending on how the disk (or partition) was previously formatted.  If you get an error that LVM can’t initialize a disk with a partition table on it, first make sure that the disk you are operating on is the correct one!  Once you have confirmed that /dev/hdb is the disk you really want to reformat, run the following dd command to erase the old partition table:

# Warning DANGEROUS!
# The following commands will destroy the partition table on the
# disk being operated on.  Be very sure it is the correct disk!
dd if=/dev/zero of=/dev/hdb bs=1k count=1 blockdev \
    --rereadpt /dev/hdb

For partitions run “pvcreate” on the partition:

    pvcreate /dev/hdb1

This creates a volume group descriptor at the start of the /dev/hdb1 partition.  (Note that if using LVM version 1 on PCs with DOS partitions, you must first set the partition type to “0x8e” using fdisk or some other similar program.)

 

Create Volume Groups (VGs)

Use the “vgcreate” program to group selected PVs into VGs, and to optionally set the extent size (the default is 4MB).  The following command creates a volume group named “VG1” from two disk partitions from different disks:

    vgcreate VG1 /dev/hda1 /dev/hdb1

Modern systems may use “devfs” or some similar system, which creates symlinks in /dev for detected disks.  With such systems names like “/dev/hda1” are actually the symlinks to the real names.  You can use either the symlink or the real name in the LVM commands, however the older version of LVM demanded you use the real device names, such as /dev/ide/host0/bus0/target0/lun0/part1 and /dev/ide/host0/bus0/target1/lun0/part1.

You can also specify the extent size with this command using the “-s size” option, if the 4Mb default not what you want.  The size is a value followed by one of k (for kilobytes), m (megabytes), g (gigabytes), or t (tetrabytes).  In addition you can put some limits on the number of physical or logical volumes the volume can have.  You may want to change the extent size for performance, administrative convenience, or to support very large logical volumes.  (Note there may be kernel limits and/or application limits on the size of LVs and files on your system.  For example Linux 2.4 kernel has a max size of 2TB.)

The “vgcreate” command adds some information to the headers of the included PVs.  However the kernel modules needed to use the VGs as disks aren’t loaded yet, and thus the kernel doesn’t “see” the VGs you created.  To make the VGs visible you must activate them.  Only active volume groups are subject to changes and allow access to their logical volumes.

To activate a single volume group VG1, use the command:

    vgchange -a y /dev/VG1

(“-a” is the same as “--available“.)  To active all volume groups on the system use:

    vgchange -a y
 

Create Logical Volumes (LVs)

Creating a logical volume in some VG is the most complex part of LVM setup, due to the many options available.  The basic command syntax is:

    lvcreate options size VG_name

Where size is either “-l num_extents” or “-L num_bytes“, where num_bytes is a number followed by one of k, m, g, or t.  If this second form is used you may not get an LV of that exact size, as LVs are always a whole number of extents.  You can also use “--extents” for “-l” or “--size” for “-L“.

One of the most common options is “-n name” (you can use “--name” for “-n“) to specify a name for the logical volume.  If you don’t use this option than the LVs are named automatically “lvol1“, “lvol2“, “lvol3“, etc.

Other options include “-C y” (or “--contiguous y“) to create an LV with contiguous allocation, “-i num_stripes -I stripe_size” to create an LV with striped mapping (stripe_size is a number between 2 and 9, as described above), and “-r num_sectors” (or “--readahead num_sectors“) to set the amount of read ahead to a value between 2 and 120.

Another form of this command is used to create snapshot volumes.  This will be discussed below.

Some examples of creating logical volumes LV1 and LV2 from the volume group VG1:

# To create a 20GB linear LV named "LV1" for some VG named
# "VG1" and its block device special file "/dev/VG1/LV1":
lvcreate -L 20g -n LV1 VG1

# To create a LV of 100 extents with 2 stripes and stripe size 4 KB:
lvcreate -i 2 -I 4 -l 100 -n LV2 VG1

If you want to create an LV that uses the entire VG, use the “vgdisplay” command to find the “Total PE” size, then use that when running lvcreate.

Once the LVs have been created you can format them with filesystems (or as swap space) using standard tools such as “mkfs“.  If the new filesystem can be successfully mounted, a final step is to edit the /etc/fstab file and possibly the rc.sysinit file, so that the volumes are mounted automatically at boot time.  It may also be necessary to setup an initial ramdisk for booting (if the “root” filesystem is built on a logical volume).

 

Create and Use a Snapshot

To create a snapshot of some existing LV, a form of the lvcreate command is used:

root# lvcreate size option -s -n name existing_LV

where size is as discussed previously, “-s” (or “--snapshot“) indicates a snapshot LV, “-n name” (or “--name name“) says to call the snapshot LV name.  The only option allowed is “-c chunk_size” (or “--chunksize chunk_size“), where chunk_size is specified as a power of 2 in this formula: (2^chunk_size) × 1024 bytes, where 2 ≤ chunk_size ≤ 10.)

Suppose you have a volume group VG1 with a logical volume LV1 you wish to backup using a snapshot.  you can estimate the time the backup will take, and the amount of disk writes that will take place during that time (plus a generous fudge factor), say 300MB.  Then you would run the command:

root# lvcreate -l 300m -s -n backup LV1

to create a snapshot logical volume named /dev/VG1/backup which has read-only access to the contents of the original logical volume named /dev/VG1/LV1 at point in time the snapshot was created.  Assuming the original logical volume contains a file system you now mount the snapshot logical volume on some (empty) directory, then backup the mounted snapshot while the original filesystem continues to get updated.  When finished, unmount the snapshot and delete it (or it will continue to grow as LV1 changes, and eventually run out of space).

Note: If the snapshot is of an XFS filesystem, the xfs_freeze command should be used to quiesce the filesystem before creating the snapshot (if the filesystem is mounted):

/root# xfs_freeze -f /mnt/point;
/root# lvcreate -L 300M -s -n backup /dev/VG1/LV1
/root# xfs_freeze -u /mnt/point
Warning	Full snapshot are automatically disabled

Now create a mount-point (an empty directory) and mount the volume:

/root# mkdir /mnt/dbbackup
/root# mount /dev/VG1/backup /mnt/dbbackup
mount: block device /dev/ops/dbbackup is write-protected, mounting read-only

If you are using XFS as the filesystem you will need to add the “nouuid” option to the mount command as follows:

/root# mount /dev/VG1/backup /mnt/dbbackup -o nouuid,ro

Do the backup, say by using tar to some “DDS4″ or “DAT” tape backup device:

/root# tar -cf /dev/rmt0 /mnt/dbbackup
tar: Removing leading `/' from member names

When the backup has finished you unmount the volume and remove it from the system:

root# umount /mnt/dbbackup
root# lvremove /dev/VG1/backup
lvremove -- do you really want to remove "/dev/VG1/backup"? [y/n]: y
lvremove -- doing automatic backup of volume group "VG1"
lvremove -- logical volume "/dev/VG1/backup" successfully removed
 

Examining LVM Information

To see information about some VG use:

vgdisplay some_volume_group
vgs some_volume_group

To see information about some PV use the command:

pvdisplay some_disk_or_partition # e.g., /dev/hda1
pvs some_disk_or_partition

To see information about some LV use:

lvdisplay some-logical-volume
lvs some-logical-volume

The man pages for these commands provides further details.

 

Grow VGs, LVs, and Filesystems

To grow a filesystem, you must install a new hard disk (unless you have free space available), format it as a PV,add that PV to your VG, then add the space to your LV, and finally use the filesystem tools to grow it.  (Not all filesystem allow or come with tools to grow and shrink them!)

VGs are resizable (spelled in Linux as “resizeable”) by adding or removing PVs from them.  However by default they are created as fixed in size.  To mark a VG as resizable use the command:

root# vgchange -x y  #or --resizeable y

Once this is done add a PV (say “hdb2″) to some VG (say “VG1″) with the command:

root# vgextend VG1 /dev/hdb2

Next, extend an LV with the “lvextend” command.  This command works almost the same as the “lvcreate” command, but with a few different options.  When specifying how much to increase the size of the LV, you can either specify how much to grow the LV with “+size” or you can specify the new (absolute) size (by omitting the plus sign).  So to extend the LV “LV1″ on VG “VG1″ by 2GB, use:

root# lvextend -L +2G /dev/VG1/LV1

You could also use:

root# lvresize -L +2G /dev/VG1/LV1

It would be a good idea to use the same mapping as the original LV, or you will have strange performance issues!  Also note this command can be used to extend a snapshot volume if necessary.

After you have extended the logical volume the last step is to increase the file system size.  How you do this depends on the file system you are using.  Most filesystem types come with their own utilities to grow/shrink filesystems, if they allow that.  These utilities usually grow to fill the entire partition or LV, so there is no need to specify the filesystem size.

Some common filesystem utilities are (assume we are expanding the /home filesystem in LV1 on VG1):

  • EXT2 and EXT3:
    EXT2/3 filesystems must be unmounted before they can be resized.  The commands to use are: 

    root# umount /home # /home is the mount point for /dev/VG1/LV1
    root# fsck -f /home # required!
    root# resize2fs /dev/VG1/LV1 # grow FS to fill LV1.
    root# mount /home
  • Reiserfs:
    The Reiserfs file system can be safely resized while mounted.  If unmounted resizing is preferred, first umount and afterward mount the filesystem.  For online resizing just use: 

    root# resize_reiserfs -f /dev/VG1/LV1
  • XFS:
    The XFS file systems must be mounted to be resized, and the mount-point is specified rather than the device name: 

    root# xfs_growfs /home
  • JFS:
    Like XFS, the JFS file system must be mounted to be resized and the mount-point is specified rather than the device name.  JFS doesn’t have a special utility for resizing, but the mount command has an option that can be used: 

    root# mount -o remount,resize /home

    In some cases the exact number of blocks must be specified.  (A kernel bug in some older Linux versions prevents the LV size from being determined automatically.)  For example to resize a JFS file system that has a 4KB block size (the default) to 4GB, you must use 1M 4KB-blocks.  Now “1M” is 2 raised to the power of 20 (=1048576), so use:

    root# mount -o remount,resize=1048576 /home
 

Shrink VGs, LVs, and Filesystems

To shrink a filesystem, you perform the same steps for growing one but in reverse order.  You first shrink the filesystem, then remove the space from the LV (and put it back into the VG).  Other LVs in the same VG can now use that space.  To use it in another VG, you must remove the corresponding PV from the one VG and add it to the other VG.

To shrink a LV you must first shrink the filesystem in that LV.  This can be done with the resize2fs for EXT2/3, or resize_reiserfs for ReiserFS (doing this off-line is safer but not required).  There are similar tools for other filesystem types.  Here’s an example of shrinking /home by 1 GB:

# df
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda1             145M   16M  122M  12% /boot
/dev/mapper/vg01-lv01  49G  3.7G   42G   9% /home
...
# umount /home
# fsck -f /home # required!
fsck 1.38 (30-Jun-2005)
e2fsck 1.38 (30-Jun-2005)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
/home: 32503/6406144 files (0.3% non-contiguous), 1160448/12845056 blocks
# resize2fs -p /dev/vg01/lv01 48G
resize2fs 1.38 (30-Jun-2005)
Resizing the filesystem on /dev/vg01/lv01 to 12799788 (4k) blocks.
Begin pass 3 (max = 9)
Scanning inode table          XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
The filesystem on /dev/vg01/lv01 is now 12799788 blocks long.

Currently XFS and JFS filesystem types do not support shrinking.  If a newer version of these filesystems will support this, mount may have been updated to support these filesystem types.  (And if not a new tool may be released.)  For such filesystems you can resize them the hard way: Backup the data using some archive tool (e.g., cpio, tar, star, or you can copy the data to some other disk).  Then delete the filesystem in the LV, then shrink the LV, then recreate the new (smaller) filesystem, and finally restore the data.

Once the filesystem has been shrunk it is time to shrink the logical volume.  You can use either the lvreduce command or the lvresize command.  Continuing from the example above:

# lvresize -L -1G /dev/vg01/lv01
  Rounding up size to full physical extent 96.00 MB
  WARNING: Reducing active logical volume to 48 GB
  THIS MAY DESTROY YOUR DATA (filesystem etc.)
Do you really want to reduce lv01? [y/n]: y
  Reducing logical volume lv01 to 48 GB
  Logical volume lv01 successfully resized
# mount /home

To shrink a VG (say “VG1“), a PV (say “hdc“) can be removed from it if none of that PV’s extents (the PEs) are in use by any LV.  Run the command:

root# vgreduce VG1 /dev/hdc

You might want to do this to upgrade or replace a worn-out disk.  If the PV is in use by some LV, you must first migrate the data to another available PV within the same VG.  To move all the data from a PV (say “hdb2“) to any unused, large enough PV within that VG, use the command:

root# pvmove /dev/hdb2
 

Delete LVs and VGs

A logical volume (say “LV3” on the volume group “VG2“) must be unmounted before it can be removed.  The steps for this are simple:

root# umount /dev/VG2/LV3
root# lvremove /dev/VG2/LV3

Before a volume group (say “VG2“) is removed you must first deactivate it.  This is done with the command:

root# vgchange -a n VG2

Now the VG can be removed.  This of course will destroy all LVs within it.  The various PVs that made up that VG can then be re-assigned to some other VGs.  Remove (a non-active) volume group with:

root# vgremove VG2
 

Summary and Examples

In the following examples assume that LVM2 is installed and up to date, and the boot scripts have been modified already if needed.  The first example includes some commentary and some command output; the second is much shorter but uses the long option names just for fun.

Home directory Example

In this example we will create a logical volume to hold the “/home” partition for a multi-media development system.  The system will use a standard EXT3 filesystem of 60 GB, built using 3 25GB SCSI disks (and no hardware RAID).  Since multi-media uses large files it makes sense to use stripe mapping and read-ahead.  We will call the volume group “vg1” and the logical volume “home“:

  1. Initialize the disks as PVs:
    /root# pvcreate /dev/sda /dev/sdb /dev/sdc
  2. Create a Volume Group, then check it’s size:
    /root# vgcreate vg1 /dev/sda /dev/sdb /dev/sdc
    /root# vgdisplay
    vgdisplay
    --- Volume Group ---
    VG Name	          vg1
    VG Access             read/write
    VG Status             available/resizable
    VG #                  1
    MAX LV                256
    Cur LV                0
    Open LV               0
    MAX LV Size           255.99 GB
    Max PV                256
    Cur PV                3
    Act PV                3
    VG Size               73.45 GB
    PE Size               4 MB
    Total PE              18803
    Alloc PE / Size       0 / 0
    Free  PE / Size       18803/ 73.45 GB
    VG UUID               nP2PY5-5TOS-hLx0-FDu0-2a6N-f37x-0BME0Y
  3. Create a 60 GB logical volume, with stripe set of 3 PVs and stripe size of 4 (which shows 2^4 KB = 16KB):
    /root# lvcreate -i 3 -I 4 -L 60G -n home vg1
    lvcreate -- rounding 62614560 KB to stripe boundary size 62614560 KB / 18803 PE
    lvcreate -- doing automatic backup of "vg1"
    lvcreate -- logical volume "/dev/vg1/home" successfully created
  4. Create an EXT3 filesystem in the new LV:
    /root# mkfs -t ext3 /dev/vg1/home
  5. Test the new FS:
    /root# mount /dev/vg1/home /mnt
    /root# df | grep /mnt
    /root# umount /dev/vg1/home
  6. Update /etc/fstab with the revised entry for /home.
  7. Finally, don’t forget to update the system journal.

Oracle Database Example

In this example we will create 2 LVs for an Oracle database.  Oracle manages its own striping and read-head/caching, so we won’t use these LVM features.  However using hardware RAID is useful, so we will use two RAID 10 disks, hdb and hdc.  The tables will use one logical volume called “tables” on one disk and the indexes and control files will be on a second LV called “indexes” on the other disk.  Both LVs will exist in the VG called “db“.  Both filesystems will be XFS, for good performance for large database files:

/root# pvcreate /dev/hdb /dev/hdc
/root# vgcreate db /dev/hdb /dev/hdc
/root# lvcreate --size 200G --name tables db
/root# lvcreate --size 200G --name indexes db
/root# mkfs -t xfs /dev/db/tables
/root# mkfs -t xfs /dev/db/indexes
/root# vi /etc/fstab
/root# vi ~/system-journal

 


Send comments and questions to pollock@acm.org
Last updated by Wayne Pollock on 09/26/2011 03:08:38.
Posted in Computer Science | Leave a comment

Bonds

Bonds are loans that investors make to corporation and goverments. The borrowers get the cash they need while the lender earn interest.

Has fixed maturity date when the bond expires and the loan must be paid back in full at par value.

The interest a bond pays is also set when the bond is issued.

Investors can buy bonds issued by U.S. companies, by the U.S. Treasury, by various cities and states and various federal, state and local goverment agencies.  Many overseas companies and goverments also sell bonds to U.S. inverstors. When those bonds are sold in dollars rather than the currency of the issuing country, they’re sometimes know as yankee bonds.

The life, or term, of any bond is fixed at the time of issue. It can range from short-term (usually a year of less), to intermediate-term(two to ten years), to long-term(30 years or more)

The relationship between the interest rates paid on short-term and long-term bonds is called the yield curve

People try to make money on bonds by trading them because of interest rate fluctuation

The value of a bond is determined by the interest it pays and what’t happening in the economy.

If the bond inversto buys at par, and holds the bond to maturity, inflation (or the shrinking value of the dollar) is the worst enemy. The longer the maturity of the bond, the greater this risk that at some point inflation will rise dramatically and reduce the value of the dolllars that the invertor is repaid.

The bond prices fluctuate according to the interest rate the bond pays, the degree of certainty of repayment and the overall economic conditions  –especially the rate of inflation– which influence interest rates.

Generally, when inflation is up, interest rates go up. And conversely, when inflation is low, so are interest rates.

The Yield of a bond is what you actually earn. Most charts express current yield as a percentage; that is if you have a bond that pays n%, it means your interest payments will be n% of the what you pay for the bond.

The formula for yield is (annual interest)/(price paid)

Posted in Business, Language of Business | Leave a comment

OpenMP fast track

Ripped from the page ”Guide into OpenMP: Easy multithreading programming for C++

By Joel Yliluoma, September 2007; last update in December 2008

Abstract

This document attempts to give a quick introduction to OpenMP (as of version 3.0), a simple C/C++/Fortran compiler extension that allows to add parallelism into existing source code without significantly having to rewrite it.In this document, we concentrate on the GCC compiler and the C++ language in particular. However, other compilers and languages supporting OpenMP also exist.This document covers OpenMP 3.0, but because version 2.5 is more commonly supported than 3.0, we also place emphasis on version 2.5.
Table of contents [expand all] [collapse all]

Preface: Importance of multithreading

As CPU speeds no longer improve as significantly as they did before, multicore systems are becoming more popular.To harness that power, it is becoming important for programmers to be knowledgeable in parallel programming — making a program execute multiple things simultaneously.This document attempts to give a quick introduction to OpenMP, a simple C/C++/Fortran compiler extension that allows to add parallelism into existing source code without significantly having to entirely rewrite it.In this document, we concentrate on the GCC compiler and the C++ language in particular. However, other compilers and languages supporting OpenMP also exist.

Support in different compilers

  • OpenMP 2.5 is supported by the GNU Compiler Collection (GCC) since version 4.2. Add the commandline option -fopenmp to enable it.
  • OpenMP 3.0 is supported by the GNU Compiler Collection (GCC) since version 4.4. Add the commandline option -fopenmp to enable it.
  • OpenMP 2.5 is supported by Microsoft Visual C++ 2005 (cl). Add the commandline option /openmp to enable it.
  • OpenMP 2.5 is supported by Intel C Compiler (icc) since version 10.1. Add the commandline option -openmp to enable it. Add the -openmp-stubs option instead to enable the library without actual parallel execution.
  • OpenMP 3.0 is supported by Intel C Compiler (icc) since version 11.0. Add the commandline option -openmp to enable it. Add the -openmp-stubs option instead to enable the library without actual parallel execution.

Note: If your GCC complains that “-fopenmp” is valid for D but not for C++ when you try to use it, or does not recognize the option at all, your GCC version is too old. If your linker complains about missing GOMP functions, you forgot to specify “-fopenmp” in the linking.More information: http://openmp.org/wp/openmp-compilers/

Introduction to OpenMP in C++

OpenMP consists of a set of compiler #pragmas that control how the program works. The pragmas are designed so that even if the compiler does not support them, the program will still yield correct behavior, but without any parallelism.Here are two simple example programs demonstrating OpenMP.You can compile them like this:
  g++ tmp.cpp -fopenmp

Example: Initializing a table in parallel

  #include <cmath>
  int main()
  {
    const int size = 256;
    double sinTable[size];

    #pragma omp parallel for
    for(int n=0; n<size; ++n)
      sinTable[n] = std::sin(2 * M_PI * n / size);

    // the table is now initialized
  }

Example: Calculating the Mandelbrot fractal in parallel

This program calculates the classic Mandelbrot fractal at a low resolution and renders it with ASCII characters, calculating multiple pixels in parallel.
 #include <complex>
 #include <cstdio>

 typedef std::complex<double> complex;

 int MandelbrotCalculate(complex c, int maxiter)
 {
     // iterates z = z² + c until |z| >= 2 or maxiter is reached,
     // returns the number of iterations.
     complex z = c;
     int n=0;
     for(; n<maxiter; ++n)
     {
         if( std::abs(z) >= 2.0) break;
         z = z*z + c;
     }
     return n;
 }
 int main()
 {
     const int width = 78, height = 44, num_pixels = width*height;

     const complex center(-.7, 0), span(2.7, -(4/3.0)*2.7*height/width);
     const complex begin = center-span/2.0, end = center+span/2.0;
     const int maxiter = 100000;

   #pragma omp parallel for ordered schedule(dynamic)
     for(int pix=0; pix<num_pixels; ++pix)
     {
         const int x = pix%width, y = pix/width;

         complex c = begin + complex(x * span.real() / (width +1.0),
                                     y * span.imag() / (height+1.0));

         int n = MandelbrotCalculate(c, maxiter);
         if(n == maxiter) n = 0;

       #pragma omp ordered
         {
           char c = ' ';
           if(n > 0)
           {
               static const char charset[] = ".,c8M@jawrpogOQEPGJ";
               c = charset[n % (sizeof(charset)-1)];
           }
           std::putchar(c);
           if(x+1 == width) std::puts("|");
         }
     }
 }

(There are a multitude of ways this program can be improved but that is beside the point.)

Discussion

As you can see, there is very little in the program that indicates that it runs in parallel. If you remove the #pragma lines, the result is still a valid C++ program that runs and does the expected thing.Only when the compiler interprets those #pragma lines, it becomes a parallel program. It really does calculate N values simultaneously where N is the number of threads. In GCC, libgomp determines that from the number of processors.By C and C++ standards, if the compiler encounters a #pragma that it does not support, it will ignore it. So adding the OMP statements can be done safely[1] without breaking compatibility with legacy compilers.There is also a runtime library that can be accessed through omp.h, but it is less often needed. If you need it, you can check the #define _OPENMP for conditional compilation in case of compilers that don’t support OpenMP.[1]: Within the usual parallel programming issues (concurrency, mutual exclusion) of course.

The syntax

All OpenMP directives in C and C++ are indicated with a #pragma omp followed by parameters, ending in a newline. The pragma usually applies only into the statement immediately following it, except for the barrier and flush commands, which do not have associated statements.

The parallel pragma

The parallel pragma starts a parallel block. It creates a team of N threads (where N is determined at runtime, usually from the number of CPU cores, but may be affected by a few things), all of which execute the next statement (or the next block, if the statement is a {…} -enclosure). After the statement, the threads join back into one.
  #pragma omp parallel
  {
    // Code inside this region runs in parallel.
    printf("Hello!\n");
  }

This example prints the text “Hello!” followed by a newline as many times as there are threads in the team created. For a dual-core system, it will output the text twice. (Note: It may also output something like “HeHlellolo”, depending on system, because the printing happens in parallel.)Internally, GCC implements this by creating a magic function and moving the associated code into that function, so that all the variables declared within that block become local variables of that function (and thus, locals to each thread).
ICC, on the other hand, uses a mechanism resembling fork(), and does not create a magic function. Both implementations are, of course, valid, and semantically identical.Variables shared from the context are handled transparently, sometimes by passing a reference and sometimes by using register variables which are flushed at the end of the parallel block (or whenever a flush is executed).The parallelism can be made conditional by including a if clause in the parallel command, such as:

  extern int parallelism_enabled;
  #pragma omp parallel for if(parallelism_enabled)
  for(int c=0; c<n; ++c)
    handle(c);

In this case, if parallelism_enabled evaluates to a zero value, the number of threads in the team that processes the for loop will always be exactly one.

Loop directive: for

The for directive splits the for-loop so that each thread in the current team handles a different portion of the loop.
 #pragma omp for
 for(int n=0; n<10; ++n)
 {
   printf(" %d", n);
 }
 printf(".\n");

This loop will output each number from 0…9 once. However, it does not do so necessarily in order. It may output, for example:

  0 5 6 7 1 8 2 3 4 9.

Internally, the above loop becomes into code equivalent to this:

  int this_thread = omp_get_thread_num(), num_threads = omp_get_num_threads();
  int my_start = (this_thread  ) * 10 / num_threads;
  int my_end   = (this_thread+1) * 10 / num_threads;
  for(int n=my_start; n<my_end; ++n)
    printf(" %d", n);

So each thread gets a different section of the loop, and they execute their own sections in parallel.Note: #pragma omp for only delegates portions of the loop for different threads in the current team. A team is the group of threads executing the program. At program start, the team consists only of a single member: the master thread that runs the program.To create a new team of threads, you need to specify the parallel keyword. It can be specified in the surrounding context:

 #pragma omp parallel
 {
  #pragma omp for
  for(int n=0; n<10; ++n) printf(" %d", n);
 }
 printf(".\n");

Equivalent shorthand is to specify it in the pragma itself, as #pragma omp parallel for:

 #pragma omp parallel for
 for(int n=0; n<10; ++n) printf(" %d", n);
 printf(".\n");

You can explicitly specify the number of threads to be created in the team, using the num_threads attribute:

 #pragma omp parallel num_threads(3)
 {
   // This code will be executed by three threads.

   // Chunks of this loop will be divided amongst
   // the (three) threads of the current team.
   #pragma omp for
   for(int n=0; n<10; ++n) printf(" %d", n);
 }

Note that OpenMP also works for C. However, in C, you need to set explicitly the loop variable as private, because C does not allow declaring it in the loop body:

 int n;
 #pragma omp for private(n)
 for(n=0; n<10; ++n) printf(" %d", n);
 printf(".\n");

See the “private and shared clauses” section for details.In OpenMP 2.5, the iteration variable in for must be a signed integer variable type. In OpenMP 3.0, it may also be an unsigned integer variable type, a pointer type or a constant-time random access iterator type. In the latter case, std::distance() will be used to determine the number of loop iterations.

What are: parallel, for and a team

The difference between parallel, parallel for and for is as follows:
  • A team is the group of threads that execute currently.
    • At the program beginning, the team consists of a single thread.
    • A parallel directive splits the current thread into a new team of threads for the duration of the next block/statement, after which the team merges back into one.
  • for divides the work of the for-loop among the threads of the current team. It does not create threads, it only divides the work amongst the threads of the currently executing team.
  • parallel for is a shorthand for two commands at once: parallel and for. Parallel creates a new team, and for splits that team to handle different portions of the loop.

If your program never contains a parallel directive, there is never more than one thread; the master thread that starts the program and runs it, as in non-threading programs.

Scheduling

The scheduling algorithm for the for-loop can explicitly controlled.
 #pragma omp for schedule(static)
 for(int n=0; n<10; ++n) printf(" %d", n);
 printf(".\n");

static is the default schedule as shown above. Upon entering the loop, each thread independently decides which chunk of the loop they will process.There is also the dynamic schedule:

 #pragma omp for schedule(dynamic)
 for(int n=0; n<10; ++n) printf(" %d", n);
 printf(".\n");

In the dynamic schedule, there is no predictable order in which the loop items are assigned to different threads. Each thread asks the OpenMP runtime library for an iteration number, then handles it, then asks for next, and so on. This is most useful when used in conjunction with the ordered clause, or when the different iterations in the loop may take different time to execute.The chunk size can also be specified to lessen the number of calls to the runtime library:

 #pragma omp for schedule(dynamic, 3)
 for(int n=0; n<10; ++n) printf(" %d", n);
 printf(".\n");

In this example, each thread asks for an iteration number, executes 3 iterations of the loop, then asks for another, and so on. The last chunk may be smaller than 3, though.Internally, the loop above becomes into code equivalent to this (illustration only, do not write code like this):

  int a,b;
  if(GOMP_loop_dynamic_start(0,10,1, 3, &a,&b))
  {
    do {
      for(int n=a; n<b; ++n) printf(" %d", n);
    } while(GOMP_loop_dynamic_next(&a,&b));
  }

The guided schedule appears to have behavior of static with the shortcomings of static fixed with dynamic-like traits. It is difficult to explain — this example program maybe explains it better than words. (Requires libSDL to compile.)

Ordering

The order in which the loop iterations are executed is unspecified, and depends on runtime conditions.However, it is possible to force that certain events within the loop happen in a predicted order, using the ordered clause.
 #pragma omp for ordered schedule(dynamic)
 for(int n=0; n<100; ++n)
 {
   files[n].compress();

   #pragma omp ordered
   send(files[n]);
 }

This loop “compresses” 100 files with some files being compressed in parallel, but ensures that the files are “sent” in a sequential order.If the thread assigned to compress file 7 is done but the file 6 has not yet been sent, the thread will wait before sending, and before starting to compress another file. The ordered clause in the loop guarantees that there always exists one thread that is handling the lowest-numbered unhandled task.Each file is compressed and sent exactly once, but the compression may happen in parallel.There may only be one ordered block per an ordered loop, no less and no more. In addition, the enclosing for directive must contain the ordered clause.

Sections

Sometimes it is handy to indicate that “this and this can run in parallel”. The sections setting is just for that.
 #pragma omp sections
 {
   { Work1(); }
   #pragma omp section
   { Work2();
     Work3(); }
   #pragma omp section
   { Work4(); }
 }

This code indicates that any of the tasks Work1, Work2 + Work3 and Work4 may run in parallel, but that Work2 and Work3 must be run in sequence. Each work is done exactly once.As usual, if the compiler ignores the pragmas, the result is still a correctly running program.Internally, GCC implements this as a combination of the parallel for and a switch-case construct. Other compilers may implement it differently.Note: #pragma omp sections only delegates the sections for different threads in the current team. To create a team, you need to specify the parallel keyword either in the surrounding context or in the pragma, as #pragma omp parallel sections.
Example:

 #pragma omp parallel sections // starts a new team
 {
   { Work1(); }
   #pragma omp section
   { Work2();
     Work3(); }
   #pragma omp section
   { Work4(); }
 }

or

 #pragma omp parallel // starts a new team
 {
   //Work0(); // this function would be run by all threads.

   #pragma omp sections // divides the team into sections
   {
     // everything herein is run only once.
     { Work1(); }
     #pragma omp section
     { Work2();
       Work3(); }
     #pragma omp section
     { Work4(); }
   }

   //Work5(); // this function would be run by all threads.
 }

The task directive (OpenMP 3.0 only)

When for and sections are too cumbersome, the task directive can be used. This is only supported in OpenMP 3.0.These examples are from the OpenMP 3.0 manual:
struct node { node *left, *right; };
extern void process(node* );
void traverse(node* p)
{
    if (p->left)
        #pragma omp task // p is firstprivate by default
        traverse(p->left);
    if (p->right)
        #pragma omp task // p is firstprivate by default
        traverse(p->right);
    process(p);
}

In the next example, we force a postorder traversal of the tree by adding a taskwait directive. Now, we can safely assume that the left and right sons have been executed before we process the current node.

struct node { node *left, *right; };
extern void process(node* );
void postorder_traverse(node* p)
{
    if (p->left)
        #pragma omp task // p is firstprivate by default
        postorder_traverse(p->left);
    if (p->right)
        #pragma omp task // p is firstprivate by default
        postorder_traverse(p->right);
    #pragma omp taskwait
    process(p);
}

The following example demonstrates how to use the task construct to process elements of a linked list in parallel. The pointer p is firstprivate by default on the task construct so it is not necessary to specify it in a firstprivate clause.

struct node { int data; node* next; };
extern void process(node* );
void increment_list_items(node* head)
{
    #pragma omp parallel
    {
        #pragma omp single
        {
            for(node* p = head; p; p = p->next)
            {
            	#pragma omp task
            	process(p); // p is firstprivate by default
            }
        }
    }
}

Thread-safety (i.e. mutual exclusion)

There are a wide array of concurrency and mutual exclusion problems related to multithreading programs. I won’t explain them here in detail; there are many good books dealing with the issue. (For example, Multithreaded, Parallel, and Distributed Programming by Gregory R. Andrews.)Instead, I will explain the tools that OpenMP provides to handle mutual exclusion correctly.

Atomicity

Atomicity means that something is inseparable; an event either happens completely or it does not happen at all, and another thread cannot intervene during the execution of the event.
 #pragma omp atomic
 counter += value;

The atomic keyword in OpenMP specifies that the denoted action happens atomically. It is commonly used to update counters and other simple variables that are accessed by multiple threads simultaneously.Unfortunately, the atomicity setting can only be specified to simple expressions that usually can be compiled into a single processor opcode, such as increments, decrements, xors and the such. For example, it cannot include function calls, array indexing, overloaded operators, non-POD types, or multiple statements. Furthermore, it only specifies atomicity for the action relating to the lefthand side of the operation; it does not guarantee that the expression on the right side of the operator is evaluated atomically. If you need to atomicise more complex constructs, use either the critical directive or the locks, as explained below.See also reduction.

The critical directive

The critical directive restricts the execution of the associated statement / block to a single thread at time.The critical directive may optionally contain a global name that identifies the type of the critical directive. No two threads can execute a critical directive of the same name at the same time.If the name is omitted, a default name is assumed.
 #pragma omp critical(dataupdate)
 {
   datastructure.reorganize();
 }
 ...
 #pragma omp critical(dataupdate)
 {
   datastructure.reorganize_again();
 }

In this example, only one of the critical sections named “dataupdate” may be executed at any given time, and only one thread may be executing it at that time. I.e. the functions “reorganize” and “reorganize_again” cannot be invoked at the same time, and two calls to the function cannot be active at the same time. (Except if other calls exist elsewhere, unprotected by the critical directive.)Note: The critical section names are global to the entire program (regardless of module boundaries). So if you have a critical section by the same name in multiple modules, not two of them can be executed at the same time.If you need something like a local mutex, see below.

Locks

The OpenMP runtime library provides a lock type, omp_lock_t in its include file, omp.h.The lock type has five manipulator functions:
  • omp_init_lock initializes the lock. After the call, the lock is unset.
  • omp_destroy_lock destroys the lock. The lock must be unset before this call.
  • omp_set_lock attempts to set the lock. If the lock is already set by another thread, it will wait until the lock is no longer set, and then sets it.
  • omp_unset_lock unsets the lock. It should only be called by the same thread that set the lock; the consequences of doing otherwise are undefined.
  • omp_test_lock attempts to set the lock. If the lock is already set by another thread, it returns 0; if it managed to set the lock, it returns 1.

Here is an example of a wrapper around std::set<> that provides per-instance mutual exclusion while still working even if the compiler does not support OpenMP.You can maintain backward compability with non-OpenMP-supporting compilers by enclosing the library references in #ifdef _OPENMP#endif blocks.

 #ifdef _OPENMP
 # include <omp.h>
 #endif
 #include <set>

 class data
 {
 private:
   std::set<int> flags;
 #ifdef _OPENMP
   omp_lock_t lock;
 #endif
 public:
   data() : flags()
   {
 #ifdef _OPENMP
     omp_init_lock(&lock);
 #endif
   }
   ~data()
   {
 #ifdef _OPENMP
     omp_destroy_lock(&lock);
 #endif
   }

   bool set_get(int c)
   {
   #ifdef _OPENMP
     omp_set_lock(&lock);
   #endif
     bool found = flags.find(c) != flags.end();
     if(!found) flags.insert(c);
   #ifdef _OPENMP
     omp_unset_lock(&lock);
   #endif
     return found;
   }
 };

Of course, you would really rather wrap the lock into a custom container to avoid littering the code with #ifdefs and also for providing exception-safety:

 #ifdef _OPENMP
 # include <omp.h>
 struct MutexType
 {
   MutexType() { omp_init_lock(&lock); }
   ~MutexType() { omp_destroy_lock(&lock); }
   void Lock() { omp_set_lock(&lock); }
   void Unlock() { omp_unset_lock(&lock); }

   MutexType(const MutexType& ) { omp_init_lock(&lock); }
   MutexType& operator= (const MutexType& ) { return *this; }
 public:
   omp_lock_t lock;
 };
 #else
 /* A dummy mutex that doesn't actually exclude anything,
  * but as there is no parallelism either, no worries. */
 struct MutexType
 {
   void Lock() {}
   void Unlock() {}
 };
 #endif

 /* An exception-safe scoped lock-keeper. */
 struct ScopedLock
 {
   explicit ScopedLock(MutexType& m) : mut(m), locked(true) { mut.Lock(); }
   ~ScopedLock() { Unlock(); }
   void Unlock() { if(!locked) return; locked=false; mut.Unlock(); }
   void LockAgain() { if(locked) return; mut.Lock(); locked=true; }
 private:
   MutexType& mut;
   bool locked;
 private: // prevent copying the scoped lock.
   void operator=(const ScopedLock&);
   ScopedLock(const ScopedLock&);
 };

This way, the example above becomes a lot simpler:

 #include <set>

 class data
 {
 private:
   std::set<int> flags;
   MutexType lock;
 public:
   bool set_get(int c)
   {
     ScopedLock lck(lock); // locks the mutex

     if(flags.find(c) != flags.end()) return true; // was found
     flags.insert(c);
     return false; // was not found
   } // automatically releases the lock when lck goes out of scope.
 };

There is also a lock type that supports nesting, omp_nest_lock_t. I won’t cover it here.

The flush directive

Even when variables used by threads are supposed to be shared, the compiler may take liberties and optimize them as register variables. This can skew concurrent observations of the variable. The flush directive can be used to ensure that the value observed in one thread is also the value observed by other threads.This example comes from the OpenMP specification.
          /* presumption: int a = 0, b = 0; */

    /* First thread */                /* Second thread */
    b = 1;                            a = 1;
    #pragma omp flush(a,b)            #pragma omp flush(a,b)
    if(a == 0)                        if(b == 0)
    {                                 {
      /* Critical section */            /* Critical section */
    }                                 }

In this example, it is enforced that at the time either of a or b is accessed, the other is also up-to-date, practically ensuring that not both of the two threads enter the critical section. (Note: It is still possible that neither of them can enter it.)You need the flush directive when you have writes to and reads from the same data in different threads.If the program appears to work correctly without the flush directive, it does not mean that the flush directive is not required. It just may be that your compiler is not utilizing all the freedoms the standard allows it to do. You need the flush directive whenever you access shared data in multiple threads: After a write, before a read.However, I do not know these:

  • Is flush needed if the shared variable is declared volatile?
  • Is flush needed if all access to the shared variable is atomic or restricted by critical sections?

Controlling which data to share between threads

In the parallel section, it is possible to specify which variables are shared between the different threads and which are not. By default, all variables are shared except those declared within the parallel block.

The private, firstprivate and shared clauses

 int a, b=0;
 #pragma omp parallel for private(a) shared(b)
 for(a=0; a<50; ++a)
 {
   #pragma omp atomic
   b += a;
 }

This example explicitly specifies that a is private (each thread has their own copy of it) and that b is shared (each thread accesses the same variable).

The difference between private and firstprivate

Note that a private copy is an uninitialized variable by the same name and same type as the original variable; it does not copy the value of the variable that was in the surrounding context.Example:
 #include <string>
 #include <iostream>

 int main()
 {
     std::string a = "x", b = "y";
     int c = 3;

     #pragma omp parallel private(a,c) shared(b) num_threads(2)
     {
         a += "k";
         c += 7;
         std::cout << "A becomes (" << a << "), b is (" << b << ")\n";
     }
 }

This will output the string “k”, not “xk”. At the entrance of the block, a becomes a new instance of std::string, that is initialized with the default constructor; it is not initialized with the copy constructor.Internally, the program becomes like this:

 int main()
 {
     std::string a = "x", b = "y";
     int c = 3;

     OpenMP_thread_fork(2);
     {                  // Start new scope
         std::string a; // Note: It is a new local variable.
         int c;         // This too.
         a += "k";
         c += 7;
         std::cout << "A becomes (" << a << "), b is (" << b << ")\n";
     }                  // End of scope for the local variables
     OpenMP_join();
 }

In the case of primitive (POD) datatypes (int, float, char* etc.), the private variable is uninitialized, just like any declared but not initialized local variable. It does not contain the value of the variable from the surrounding context. Therefore, the increment of c is moot here; the value of the variable is still undefined. (Unfortunately GCC, as of version 4.2.1, does not yet warn about uninitialized values in situations like this.)If you actually need a copy of the original value, use the firstprivate clause instead.

 #include <string>
 #include <iostream>

 int main()
 {
     std::string a = "x", b = "y";
     int c = 3;

     #pragma omp parallel firstprivate(a,c) shared(b) num_threads(2)
     {
         a += "k";
         c += 7;
         std::cout << "A becomes (" << a << "), b is (" << b << ")\n";
     }
 }

Now the output becomes “A becomes (xk), b is (y)”.

The lastprivate clause

The lastprivate clause defines a variable private as in firstprivate or private, but causes the value from the last task to be copied back to the original value after the end of the loop/sections construct.
  • In a loop construct (for directive), the last value is the value assigned by the thread that handles the last iteration of the loop. Values assigned during other iterations are ignored.
  • In a sections construct (sections directive), the last value is the value assigned in the last section denoted by the section directive. Values assigned in other sections are ignored.

Example:

 #include <stdio.h>
 int main()
 {
    int done = 4, done2 = 5;

     #pragma omp parallel for lastprivate(done, done2) num_threads(2) schedule(static)
     for(int a=0; a<8; ++a)
     {
       if(a==2) done=done2=0;
       if(a==3) done=done2=1;
     }
     printf("%d,%d\n", done,done2);
 }

This program outputs “4196224,-348582208″, because internally, this program became like this:

 #include <stdio.h>
 int main()
 {
    int done = 4, done2 = 5;
    OpenMP_thread_fork(2);
    {
        int this_thread = omp_get_thread_num(), num_threads = 2;
        int my_start = (this_thread  ) * 8 / num_threads;
        int my_end   = (this_thread+1) * 8 / num_threads;

        int priv_done, priv_done2; // not initialized, because firstprivate was not used

        for(int a=my_start; a<my_end; ++a)
        {
            if(a==2) priv_done=priv_done2=0;
            if(a==3) priv_done=priv_done2=1;
        }
        if(my_end == 8)
        {
           // assign the values back, because this was the last iteration
           done  = priv_done;
           done2 = priv_done2;
        }
    }
    OpenMP_join();
 }

As one can observe, the values of priv_done and priv_done2 are not assigned even once during the course of the loop that iterates through 4…7. As such, the values that are assigned back are completely bogus.Therefore, lastprivate cannot be used to e.g. fetch the value of a flag assigned randomly during a loop. Use reduction for that, instead.Where this behavior can be utilized though, is in situations like this (from OpenMP manual):

 void loop()
 {
   int i;
   #pragma omp for lastprivate(i)
   for(i=0; i<get_loop_count(); ++i) // note: get_loop_count() must be a pure function.
       { ... }

   printf("%d\n", i); // this shows the number of loop iterations done.
 }

The default clause

The most useful purpose on the default clause is to check whether you have remembered to consider all variables for the private/shared question, using the default(none) setting.
 int a, b=0;
 // This code won't compile: It will require that
 // it will be explicitly specified whether a is shared or private.
 #pragma omp parallel default(none) shared(b)
 {
   b += a;
 }

The default clause can also be used to set that all variables are shared by default (default(shared)).

Note: Because different compilers have different ideas about which variables are implicitly private or shared, and for which it is an error to explicitly state the private/shared status, it is recommended to use the default(none) setting only during development, and drop it in production/distribution code.

The reduction clause

The reduction clause is a mix between the private, shared, and atomic clauses.
It allows to accumulate a shared variable without the atomic clause, but the type of accumulation must be specified. It will often produce faster executing code than by using the atomic clause.This example calculates factorial using threads:
 int factorial(int number)
 {
   int fac = 1;
   #pragma omp parallel for reduction(*:fac)
   for(int n=2; n<=number; ++n)
     fac *= n;
   return fac;
 }
  • At the beginning of the parallel block, a private copy is made of the variable and preinitialized to a certain value .
  • At the end of the parallel block, the private copy is atomically merged into the shared variable using the defined operator.

(The private copy is actually just a new local variable by the same name and type; the original variable is not accessed to create the copy.)The syntax of the clause is:

  reduction(operator:list)

where list is the list of variables where the operator will be applied to, and operator is one of these:

Operator Initialization value
+, -, |, ^, || 0
*, && 1
& ~0

To write the factorial function (shown above) without reduction, it probably would look like this:

 int factorial(int number)
 {
   int fac = 1;
   #pragma omp parallel for
   for(int n=2; n<=number; ++n)
   {
     #pragma omp atomic
     fac *= n;
   }
   return fac;
 }

However, this code would be less optimal than the one with reduction: it misses the opportunity to use a local (possible register) variable for the cumulation, and needlessly places load/synchronization demands on the shared memory variable. In fact, due to the bottleneck of that atomic variable (only one thread may access it simultaneously), it would completely nullify the gains of parallelism in that loop.The version with reduction is equivalent to this code (illustration only):

 int factorial(int number)
 {
   int fac = 1;
   #pragma omp parallel
   {
     int fac_private = 1; /* This value comes from the table shown above */
     #pragma omp for nowait
     for(int n=2; n<=number; ++n)
       fac_private *= n;
     #pragma omp atomic
     fac *= fac_private;
   }
   return fac;
 }

Note how it moves the atomic operation out from the loop.The restrictions in reduction and atomic are very similar: both can only be done on POD types; neither allows overloaded operators, and both have the same set of supported operators.As an example of how the reduction clause can be used to produce semantically different code when OpenMP is enabled and when it is disabled, this example prints the number of threads that executed the parallel block:

 int a = 0;
 #pragma omp parallel reduction (+:a)
 {
   a = 1; // Assigns a value to the private copy.
   // Each thread increments the value by 1.
 }
 printf("%d\n", a);

If you preinitialized “a” to 4, it would print a number >= 5 if OpenMP was enabled, and 1 if OpenMP was disabled.
Note: If you really need to detect whether OpenMP is enabled, use the _OPENMP #define instead. To get the number of threads, use omp_get_num_threads() instead.

Execution synchronization

The barrier directive and the nowait clause

The barrier directive causes threads encountering the barrier to wait until all the other threads in the same team have encountered the barrier.
 #pragma omp parallel
 {
   /* All threads execute this. */
   SomeCode();

   #pragma omp barrier

   /* All threads execute this, but not before
    * all threads have finished executing SomeCode().
    */
   SomeMoreCode();
 }

Note: There is an implicit barrier at the end of each parallel block, and at the end of each sections, for and single statement, unless the nowait directive is used.Example:

 #pragma omp parallel
 {
   #pragma omp for
   for(int n=0; n<10; ++n) Work();

   // This line is not reached before the for-loop is completely finished
   SomeMoreCode();
 }

 // This line is reached only after all threads from
 // the previous parallel block are finished.
 CodeContinues();

 #pragma omp parallel
 {
   #pragma omp for nowait
   for(int n=0; n<10; ++n) Work();

   // This line may be reached while some threads are still executing the for-loop.
   SomeMoreCode();
 }

 // This line is reached only after all threads from
 // the previous parallel block are finished.
 CodeContinues();

The nowait directive can only be attached to sections, for and single. It cannot be attached to the within-loop ordered clause, for example.

The single and master directives

The single directive specifies that the given statement/block is executed by only one thread. It is unspecified which thread. Other threads skip the statement/block and wait at an implicit barrier at the end of the construct.
 #pragma omp parallel
 {
   Work1();
   #pragma omp single
   {
     Work2();
   }
   Work3();
 }

In a 2-cpu system, this will run Work1() twice, Work2() once and Work3() twice. There is an implied barrier at the end of the single directive, but not at the beginning of it.Note: Do not assume that the single block is executed by whichever thread gets there first. According to the standard, the decision of which thread executes the block is implementation-defined, and therefore making assumptions on it is non-conforming. In GCC, the decision is hidden inside the mechanics of libgomp.The master directive is similar, except that the statement/block is run by the master thread, and there is no implied barrier; other threads skip the construct without waiting.

 #pragma omp parallel
 {
   Work1();

   // This...
   #pragma omp master
   {
     Work2();
   }

   // ...is practically identical to this:
   if(omp_get_thread_num() == 0)
   {
     Work2();
   }

   Work3();
 }

Unless you use the threadprivate clause, the only important difference between single nowait and master is that if you have multiple master blocks in a parallel section, you are guaranteed that they are executed by the same thread every time, and hence, the values of private (thread-local) variables are the same.

Loop nesting

The problem

A beginner at OpenMP will quickly find out that this code will not do the expected thing:
 #pragma omp parallel for
 for(int y=0; y<25; ++y)
 {
   #pragma omp parallel for
   for(int x=0; x<80; ++x)
   {
     tick(x,y);
   }
 }

The beginner expects there to be N tick() calls active at the same time (where N = number of processors). Although that is true, the inner loop is not actually parallelised. Only the outer loop is. The inner loop runs in a pure sequence, as if the whole inner #pragma was omitted.At the entrance of the inner parallel directive, the OpenMP runtime library (libgomp in case of GCC) detects that there already exists a team, and instead of a new team of N threads, it will create a team consisting of only the calling thread.Rewriting the code like this won’t work:

 #pragma omp parallel for
 for(int y=0; y<25; ++y)
 {
   #pragma omp for // ERROR, nesting like this is not allowed.
   for(int x=0; x<80; ++x)
   {
     tick(x,y);
   }
 }

This code is erroneous and will cause the program to malfunction. See the restrictions chapter below for details.

Solution in OpenMP 3.0

In OpenMP 3.0, the loop nesting problem can be solved by using the collapse clause in the for directive.Example:
 #pragma omp parallel for collapse(2)
 for(int y=0; y<25; ++y)
   for(int x=0; x<80; ++x)
   {
     tick(x,y);
   }

The number specified in the collapse clauses is the number of nested loops that are subject to the work-sharing semantics of the OpenMP for directive.

Workarounds in OpenMP 2.5

Unfortunately, as of release 4.3, GCC does not yet support OpenMP 3.0. Workarounds exist for OpenMP 2.5.As was shown in the Mandelbrot example near the beginning of this document, the problem can be worked around by turning the nested loop into a non-nested loop:
 #pragma omp parallel for
 for(int pos=0; pos<(25*80); ++pos)
 {
   int x = pos%80;
   int y = pos/80;
   tick(x,y);
 }

However, rewriting code like this can be a nuisance.An alternative is to enable nested threading:

 omp_set_nested(1);
 #pragma omp parallel for
 for(int y=0; y<25; ++y)
 {
   #pragma omp parallel for
   for(int x=0; x<80; ++x)
   {
     tick(x,y);
   }
 }

In this example, the inner loop is also parallelised. However, instead of N threads, the user will find N*N threads running, because the nested parallel directive now starts a new team of N threads instead of starting a single-thread team. This might be highly undesirable, hence why the nesting flag is disabled by default.So it might be wisest to write the code like this:

 for(int y=0; y<25; ++y)
 {
   #pragma omp parallel for
   for(int x=0; x<80; ++x)
   {
     tick(x,y);
   }
 }

Now only the inner loop is run in parallel. In most cases, it is only slightly less optimal than the code that was rewritten to use only one loop.

Performance

One may be worried about the creation of new threads within the inner loop. Worry not, the libgomp in GCC is smart enough to actually only creates the threads once. Once the team has done its work, the threads are returned into a “dock”, waiting for new work to do.In other words, the number of times the clone system call is executed is exactly equal to the maximum number of concurrent threads. The parallel directive is not the same as a combination of pthread_create and pthread_join.There will be lots of locking/unlocking due to the implied barriers, though. I don’t know if that can be reasonably avoided or whether it even should.

Restrictions

There are restrictions for which clauses can be nested under which directives. The restrictions are listed in the OpenMP official specification.

Shortcomings

OpenMP and fork()

It is worth mentioning that using OpenMP in a program that calls fork() requires special consideration.This problem only affects GCC; ICC is not affected.If your program intends to become a background process using daemonize() or other similar means, you must not use the OpenMP features before the fork. After OpenMP features are utilized, a fork is only allowed if the child process does not use OpenMP features, or it does so as a completely new process (such as after exec()).This is an example of an erroneous program:
  #include <stdio.h>
  #include <sys/wait.h>
  #include <unistd.h>

  void a()
  {
    #pragma omp parallel num_threads(2)
    {
      puts("para_a"); // output twice
    }
    puts("a ended"); // output once
  }
  void b()
  {
    #pragma omp parallel num_threads(2)
    {
      puts("para_b");
    }
    puts("b ended");
  }

  int main() {
   a();   // Invokes OpenMP features (parent process)
   int p = fork();
   if(!p)
   {
     b(); // ERROR: Uses OpenMP again, but in child process
     _exit(0);
   }
   wait(NULL);
   return 0;
  }

When run, this program hangs, never reaching the line that outputs “b ended”.There is currently no workaround; the libgomp API does not specify functions that can be used to prepare for a call to fork().

Cancelling of threads (and breaking out of loops)

Suppose that we want to optimize this function with parallel processing:
 /* Returns any position from the haystack where the needle can
  * be found, or NULL if no such position exists. It is not guaranteed
  * to find the first matching position; it only guarantees to find
  * _a_ matching position if one exists.
  */
 const char* FindAnyNeedle(const char* haystack, size_t size, char needle)
 {
   for(size_t p = 0; p < size; ++p)
     if(haystack[p] == needle)
     {
       /* This breaks out of the loop. */
       return haystack+p;
     }
   return NULL;
 }

Our first attempt might be to simply tack a #pragma parallel for before the for loop, but that doesn’t work: OpenMP requires that a loop construct processes each iteration. Breaking out of the loop (using return, goto, break, throw or other means) is not allowed.To solve finder problems where N threads search for a solution and once a solution is found by any thread, all threads end their search, you will need to seek other resolutions:

  • Use pthreads and pthread_cancel() instead of OpenMP.
  • Poll an interrupt flag.

Example of polling an interrupt flag (“done”).

 const char* FindAnyNeedle(const char* haystack, size_t size, char needle)
 {
   const char* result = NULL;
   bool done = false;
   #pragma omp parallel for
   for(size_t p = 0; p < size; ++p)
   {
     #pragma omp flush(done)
     if(!done)
     {
       /* Do work only if no thread has found the needle yet. */
       if(haystack[p] == needle)
       {
         /* Inform the other threads that we found the needle. */
         done = true;
         #pragma omp flush(done)
         result = haystack+p;
       }
     }
   }
   return result;
 }

However, the above version has a performance shortcoming: if a thread detects that the search is over (“done” is true), they will still iterate through their respective chunks of the for loop. You could just as well ignore the whole “done” flag:

 const char* FindAnyNeedle(const char* haystack, size_t size, char needle)
 {
   const char* result = NULL;
   #pragma omp parallel for
   for(size_t p = 0; p < size; ++p)
     if(haystack[p] == needle)
       result = haystack+p;
   return result;
 }

This does not incur any savings in memory access; each position in “haystack” is still searched. Furthermore, the value of “result” may be set multiple times, once for each match. If matches are abundant, it may be a lot slower than the non-parallel version was.The only possibility to avoid the extra loops is to avoid the OpenMP for directive alltogether and write it manually:

 const char* FindAnyNeedle(const char* haystack, size_t size, char needle)
 {
   const char* result = NULL;
   bool done = false;
   #pragma omp parallel
   {
     int this_thread = omp_get_thread_num(), num_threads = omp_get_num_threads();
     size_t beginpos = (this_thread+0) * size / num_threads;
     size_t endpos   = (this_thread+1) * size / num_threads;
     // watch out for overflow in that multiplication.

     for(size_t p = beginpos; p < endpos; ++p)
     {
       /* End loop if another thread already found the needle. */
       #pragma omp flush(done)
       if(done) break;

       if(haystack[p] == needle)
       {
         /* Inform the other threads that we found the needle. */
         done = true;
         #pragma omp flush(done)
         result = haystack+p;
         break;
       }
     }
   }
   return result;
 }

In this version, no extra iterations are done, but there is the fact that the “done” variable is polled at each loop. If that bothers you, you can seek a middle ground by dividing the loop into shorter sections where you don’t poll the interrupt flag, and only checking it in the between of them:

 const char* FindAnyNeedle(const char* haystack, size_t size, char needle)
 {
   const char* result = NULL;
   bool done = false;
   size_t beginpos = 0, n_per_loop = 4096 * omp_get_max_threads();
   while(beginpos < size && !done)
   {
     size_t endpos = std::min(size, beginpos + n_per_loop);
     #pragma omp parallel for reduction(|:done)
     for(size_t p = beginpos; p < endpos; ++p)
       if(haystack[p] == needle)
       {
         /* Found it. */
         done = true;
         result = haystack+p;
       }
     beginpos = endpos;
   }
   return result;
 }

This however again does not guarantee that no extra comparisons are done; the value of “result” may be set multiple times in this function.If this is also unacceptable, then OpenMP has no more solutions remaining. You can use pthreads and try pthread_create to create the team and pthread_cancel to topple the sibling threads. Such a solution is not portable and is out of the scope of this article. (It can however be downloaded here.)

Some specific gotchas

C++
  • STL is not thread-safe. If you use STL containers in a parallel context, you must exclude concurrent access using locks or other mechanisms. Const-access is usually fine, as long as non-const access does not occur at the same time.
  • Exceptions may not be thrown and caught across omp directives. That is, if a code inside an omp for throws an exception, the exception must be caught before the end of the loop iteration; and an exception thrown inside a parallel section must be caught by the same thread before the end of the parallel section.

C

  • C does not allow variables to be declared in the for syntax, which means that by default, variables used for for iteration become shared. You must use the private clause explictly for loop variables.

GCC

  • fork() is troublematic when used together with OpenMP. See the chapter “OpenMP and fork()” above for details.

Other

For the data copying clauses such as threadprivate, copyprivate and copyin, see the official specification.
Posted in Computer Science, openMP | Leave a comment

Marginal distribution

In probability theory and statistics, the marginal distribution of a subset of a collection of random variables is the probability distribution of the variables contained in the subset. The term marginal variable is used to refer to those variables in the subset of variables being retained. These terms are dubbed “marginal” because they used to be found by summing values in a table along rows or columns, and writing the sum in the margins of the table.[1] The distribution of the marginal variables (the marginal distribution) is obtained by marginalizing over the distribution of the variables being discarded, and the discarded variables are said to have been marginalized out.

The context here is that the theoretical studies being undertaken, or the data analysis being done, involves a wider set of random variables but that attention is being limited to a reduced number of those variables. In many applications an analysis may start with a given collection of random variables, then first extend the set by defining new ones (such as the sum of the original random variables) and finally reduce the number by placing interest in the marginal distribution of a subset (such as the sum). Several different analyses may be done, each treating a different subset of variables as the marginal variables.

Contents

 

  • 1 Two-variable case
  • 2 Real-world example
  • 3 General cases
  • 4 See also
  • 5 References
  • 6 Bibliography

Two-variable case

Joint and marginal distributions of a pair of discrete, random variables X,Y having nonzero mutual information I(X; Y).

Given two random variables X and Y whose joint distribution is known, the marginal distribution of X is simply the probability distribution of X averaging over information about Y. This is typically calculated by summing or integrating the joint probability distribution over Y.

For discrete random variables, the marginal probability mass function can be written as Pr(X = x). This is

\Pr(X=x) = \sum_{y} \Pr(X=x,Y=y) = \sum_{y} \Pr(X=x|Y=y) \Pr(Y=y),

where Pr(X = x,Y = y) is the joint distribution of X and Y, while Pr(X = x|Y = y) is the conditional distribution of X given Y. In this case, the variable Y has been marginalized out.

Bivariate marginal and joint probabilities for discrete random variables are often displayed as two-way tables.

Similarly for continuous random variables, the marginal probability density function can be written as pX(x). This is

p_{X}(x) = \int_y p_{X,Y}(x,y) \, dy = \int_y p_{X|Y}(x|y) \, p_Y(y) \, dy ,

where pX,Y(x,y) gives the joint distribution of X and Y, while pX|Y(x|y) gives the conditional distribution for X given Y. Again, the variable Y has been marginalized out.

Real-world example

Imagine for example you want to compute the probability that a pedestrian will be hit by a car while crossing the road at a pedestrian crossing. Let H be a discrete random variable describing the probability of being hit by a car while walking over the crossing, taking one value from { hit, not hit }. Let L be a discrete random variable describing the probability of the cross traffic’s traffic light state at a given moment, taking one from { red, yellow, green }.

Realistically, H will be dependent on L. That is, P(H = hit) and P(H = not hit) will take different values depending on whether L is red, yellow or green. You are, for example, far more likely to be hit by a car if you try to cross while the lights for cross traffic are green than if they are red. In other words, for any given possible pair of values for H and L, you must feed them into the joint probability distribution of H and L to find the probability of that pair of events occurring together.

However, in trying to calculate the marginal probability P(H=hit), what we are asking for is the probability that H=hit, where we don’t actually know the particular value of L. In general you can be hit if the lights are red OR if the lights are yellow OR if the lights are green. So in this case the answer for the marginal probability can be found by summing P(H,L) = P(hit,L) for all possible values of L.

Here is a table showing the conditional probabilities of being hit, depending on the state of the lights. (Note that due to the dependence, only the columns in this table must add up to 1).

Conditional distribution: P(H|L)
  L=Red L=Yellow L=Green
H=Not Hit 0.99 0.9 0.2
H=Hit 0.01 0.1 0.8

To find the joint probability distribution, we need more data. Let’s say that P(L=red) = 0.7, P(L=yellow) = 0.1, P(L=green) = 0.2. Multiplying the columns in the conditional distribution by the appropriate values, we find the joint probability distribution of H and L. (Note that the cells in this table, excluding the marginal probabilities, now add up to 1).

Joint distribution: P(H,L)
  L=Red L=Yellow L=Green Marginal probability
H=Not Hit 0.693 0.09 0.04 0.823
H=Hit 0.007 0.01 0.16 0.177
Total 0.7 0.1 0.2 1

The marginal probability P(H=Hit) is the sum of the bottom row (above the totals row), as this is the probability of being hit when the lights are red OR yellow OR green. Similarly, the marginal probability that P(H=Not Hit) is the sum of the top row. It is important to interpret this result correctly. The chance of being hit by a car when you cross the road is obviously a lot less than 17.7%. However, what this figure is actually saying is that if you were to ignore the state of the traffic lights and cross the road no matter their color, you would have a 17.7% risk of being hit by a car. This seems more reasonable.

General cases

For multivariate distributions, formulae similar to those above apply with the symbols X and/or Y being interpreted as vectors. In particular, each summation or integration would be over all variables except those contained in X.

See also

References

  1. ^ Trumpler and Weaver (1962), pp. 32–33.

Bibliography

  • Everitt, B. S. (2002). The Cambridge Dictionary of Statistics. Cambridge University Press. ISBN 0-521-81099-x
  • Trumpler, Robert J. and Harold F. Weaver (1962). Statistical
Posted in Business, Language of Business | Leave a comment

Joint probability distribution

In the study of probability, given two random variables X and Y that are defined on the same probability space, the joint distribution for X and Y defines the probability of events defined in terms of both X and Y. In the case of only two random variables, this is called a bivariate distribution, but the concept generalizes to any number of random variables, giving a multivariate distribution.

Contents

 

  • 1 Cumulative distribution
  • 2 Discrete case
  • 3 Continuous case
  • 4 Mixed case
  • 5 General multidimensional distributions
  • 6 Joint distribution for independent variables
  • 7 See also
  • 8 External links

Cumulative distribution

The cumulative distribution function for a pair of random variables is defined in terms of their joint probability distribution;

F(x,y)=P(X \le x, Y \le y) .

Discrete case

For discrete random variables, the joint probability mass function is

 \begin{align} \mathrm{P}(X=x\ \mathrm{and}\ Y=y) & {} = \mathrm{P}(Y=y \mid X=x) \cdot \mathrm{P}(X=x) \\ & {} = \mathrm{P}(X=x \mid Y=y) \cdot \mathrm{P}(Y=y). \end{align}

Since these are probabilities, we have

\sum_x \sum_y \mathrm{P}(X=x\ \mathrm{and}\ Y=y) = 1.\;

Continuous case

Similarly for continuous random variables, the joint probability density function can be written as fX,Y(xy) and this is

f_{X,Y}(x,y) = f_{Y|X}(y|x)f_X(x) = f_{X|Y}(x|y)f_Y(y)\;

where fY|X(y|x) and fX|Y(x|y) give the conditional distributions of Y given X = x and of X given Y = y respectively, and fX(x) and fY(y) give the marginal distributions for X and Y respectively.

Again, since these are probability distributions, one has

\int_x \int_y f_{X,Y}(x,y) \; dy \; dx= 1.

Mixed case

In some situations X is continuous but Y is discrete. For example, in a logistic regression, one may wish to predict the probability of a binary outcome Y conditional on the value of a continuously-distributed X. In this case, (X, Y) has neither a probability density function nor a probability mass function in the sense of the terms given above. On the other hand, a “mixed joint density” can be defined in either of two ways:

 \begin{align} f_{X,Y}(x,y) &= f_{X|Y}(x|y)\mathrm{P}(Y=y)\\              &= \mathrm{P}(Y=y \mid X=x) f_X(x) \end{align}

Formally, fX,Y(x, y) is the probability density function of (X, Y) with respect to the product measure on the respective supports of X and Y. Either of these two decompositions can then be used to recover the joint cumulative distribution function:

 \begin{align} F_{X,Y}(x,y)&=\sum\limits_{t\le y}\int_{s=-\infty}^x f_{X,Y}(s,t)\;ds \end{align}

The definition generalizes to a mixture of arbitrary numbers of discrete and continuous random variables.

General multidimensional distributions

The cumulative distribution function for a vector of random variables is defined in terms of their joint probability distribution;

F(x_1,\dots,x_n)=P(X_1 \le x_1,\dots, X_n \le x_n) .

The joint distribution for two random variables can be extended to many random variables X1, … Xn by adding them sequentially with the identity

\begin{align} f_{X_1, \ldots X_n}(x_1, \ldots x_n) =& f_{X_n | X_1, \ldots X_{n-1}}( x_n | x_1, \ldots x_{n-1}) f_{X_1, \ldots X_{n-1}}( x_1, \ldots x_{n-1} )\\ =& f_{X_1} (x_1) \\  & \cdot f_{X_2|X_1} (x_2|x_1)\\  & \cdot \dots \\  & \cdot f_{X_{n-1}| X_1 \ldots X_{n-2}}(x_{n-1}| x_1, \ldots x_{n-2} ) \\  & \cdot f_{X_n | X_1, \ldots X_{n-1}}( x_n | x_1, \ldots x_{n-1}),\end{align}

where

\begin{align} f_{X_i| X_1, \ldots X_{i-1}}(x_i | x_1, \ldots x_{i-1})=   &\frac{f_{X_1, \dots X_i}(x_1,\dots x_i)}{\int f_{X_1, \dots X_i}(x_1,\dots x_{i-1},u_i) \mathrm{d} u_i}\\ = &\frac{\int \dots \int f_{X_1, \dots X_n}(x_1,\dots x_i,u_{i+1}, \dots u_n) \mathrm{d} u_{i+1}\dots \mathrm{d}u_n}{\int \dots \int \int f_{X_1, \dots X_n}(x_1,\dots x_{i-1},u_i, \dots u_n) \mathrm{d} u_i \,\mathrm{d} u_{i+1}\dots \mathrm{d}u_n} \end{align}

and

f_{X_1,\dots X_i}(x_1,\dots x_i) = \int \dots \int f_{X_1,\dots X_n}(x_1,\dots x_i,x_{i+1},\dots x_n) \mathrm{d} x_{i+1} \dots \mathrm{d} x_n

(notice, that these latter identities can be useful to generate a random variable (X_1, \dots X_n) with given distribution function f(x_1,\dots x_n)); the density of the marginal distribution is

f_{X_i}(x_i) = \int \dots \int \int \dots \int f_{X_1,\dots X_n}(x_1,\dots x_{i-1},x_i,x_{i+1},\dots x_n) \mathrm{d} x_1\dots \mathrm{d}x_{i-1} \, \mathrm{d}x_{i+1} \dots \mathrm{d}x_n.

The joint cumulative distribution function is

F_{X_1,\dots X_n}\left( x_1, \dots x_n\right)= \int_{-\infty}^{x_1} \dots \int_{-\infty}^{x_n} f_{X_1,\dots X_n}\left(u_1,\dots u_n\right) \mathrm{d} u_1 \dots \mathrm{d}u_n,

and the conditional distribution function is accordingly

\begin{align} F_{X_i| X_1, \ldots X_{i-1}}(x_i| x_1, \ldots x_{i-1})=   &\frac{\int_{-\infty}^{x_i}f_{X_1, \dots X_i}(x_1,\dots x_{i-1},u_i)\mathrm{d}u_i}{\int_{-\infty}^\infty f_{X_1, \dots X_i}(x_1,\dots x_{i-1},u_i) \mathrm{d} u_i}\\ = &\frac{\int_{-\infty}^\infty \dots \int_{-\infty}^\infty \int_{-\infty}^{x_i} f_{X_1, \dots X_n}(x_1,\dots x_{i-1},u_i, \dots u_n) \mathrm{d} u_i\dots \mathrm{d}u_n}{\int_{-\infty}^\infty \dots \int_{-\infty}^\infty \int_{-\infty}^\infty f_{X_1, \dots X_n}(x_1,\dots x_{i-1},u_i,\dots u_n) \mathrm{d} u_i \dots \mathrm{d} u_n}. \end{align}

Expectation reads

\mathbb{E}\left[h(X_1,\dots X_n) \right]=\int_{-\infty}^\infty \dots \int_{-\infty}^\infty h(x_1,\dots x_n) f_{X_1,\dots X_n}(x_1,\dots x_n) \mathrm{d} x_1 \dots \mathrm{d} x_n;

suppose that h is smooth enough and h(u_1,\dots u_n)=h(x_1,\dots x_n) for u_1 \ge x_1, \dots u_n\ge x_n, then, by iterated integration by parts,

\begin{align}\mathbb{E}\left[h(X_1,\dots X_n) \right]=& h(x_1,\dots x_n)+ \\ & (-1)^n \int_{-\infty}^{x_1} \dots \int_{-\infty}^{x_n} F_{X_1,\dots X_n}(u_1,\dots u_n) \frac{\partial^n}{\partial x_1 \dots \partial x_n} h(u_1,\dots u_n) \mathrm{d} u_1 \dots \mathrm{d} u_n.\end{align}

Joint distribution for independent variables

If for discrete random variables \ P(X = x \ \mbox{and} \ Y = y ) = P( X = x) \cdot P( Y = y) for all x and y, or for absolutely continuous random variables \ f_{X,Y}(x,y) = f_X(x) \cdot f_Y(y) for all x and y, then X and Y are said to be independent.

See also

External links

Posted in Business, Language of Business | Leave a comment