Suggestions for Assignment 2 - Ray Casting Spheres.
Ross Beveridge 9/24/2013

Top Level - Three Parts

Starting at the top of your design, there are three parts to the program: read in the specification of the camera and the objects to be rendered, generate the renderings, and write the resultant images to files. Reading and writing files should be straightforward and there isn't much to say here. The central part is best thought of as a doubly nested loop. The bounds on the loop are the individual pixel indices in the horizontal and vertical directions. The action carried out at each pixel is to cast a ray into the scene and retrieve a depth value and a color value from the nearest sphere intersection.

Libraries and Data Structures

You will want to use a linear algebra package so that you are not reinventing primitive operations such as matrix multiply and, in particular, the dot product. If you are using C++, students have had good luck in the past using the Boost Linear Algebra Library. However, any good library should serve your purposes should you have a reason to pick another.

If you are using the Boost library, it is already installed on the CS Department Linux machines, so please do not try to include your own install: the library is large. If you are using another library, check whether it is available on our CS machines, and if it is not, include it with your submission. As you do this, please take size into account, and if the overall package is becoming very large, talk to me or Jatin in advance.

The next question is when to create objects in the object-oriented programming sense. Generally, how you choose to define objects is very much up to you. That said, it is hard for me to imagine not having an object called a Ray. It is also hard to imagine not having an object called a Sphere. Whether points and vectors rise to the level of having their own object definition is more a matter of judgment. However, do consider this: operations in ray tracing happen millions of times, and you should work hard to avoid excessive creation and destruction of objects in the innermost loops of your ray tracer.
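To make this concrete, here is one possible sketch in plain C++ of what Ray and Sphere objects might look like. The names Vec3, Ray, and Sphere, and the choice of plain structs rather than a library type, are illustrative assumptions, not requirements:

```cpp
#include <array>

// A minimal 3-D vector; a real project might use a library type instead.
struct Vec3 {
    double x = 0, y = 0, z = 0;
};

inline Vec3 operator-(const Vec3 &a, const Vec3 &b) {
    return {a.x - b.x, a.y - b.y, a.z - b.z};
}

inline double dot(const Vec3 &a, const Vec3 &b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// A ray is an origin plus a direction (kept at unit length by convention).
struct Ray {
    Vec3 origin;
    Vec3 direction;
};

// A sphere carries its geometry plus its one material property: color.
struct Sphere {
    Vec3 center;
    double radius = 1.0;
    std::array<double, 3> color = {0, 0, 0};  // r, g, b
};
```

Note that all of these are small value types that can live on the stack, which helps avoid the object creation and destruction costs mentioned above.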

Last but not least, you may want to build an object class to represent color images. This is not strictly necessary, inasmuch as a color image is just a three-dimensional array. However, object encapsulation may prove helpful in time. There is an open design issue of when to use double floats and when to finally force image pixel values to be 8-bit integers. There is some virtue in keeping floats until you near the very end of processing and must finally write out an image file.
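As a sketch of that last point, assuming pixel values are kept as doubles in the range [0, 1] during rendering, the conversion to 8-bit integers can be confined to one small helper used only at file-writing time:

```cpp
#include <algorithm>
#include <cstdint>

// Convert a double pixel value, assumed to lie in [0, 1], to an 8-bit
// integer only at the moment the image file is written out.
inline std::uint8_t to_byte(double v) {
    v = std::clamp(v, 0.0, 1.0);  // guard against out-of-range values
    return static_cast<std::uint8_t>(v * 255.0 + 0.5);  // round to nearest
}
```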

Viewport to Pixel Transformations

There is an essential transformation that you must be able to compute in order to move between the units that define the image plane, which I will often refer to as the viewport, and the actual pixel coordinates of the final image you are rendering. Recall that our viewport will always be bounded between -1 and +1 in both the horizontal and the vertical dimension. These should be thought of as 3-D coordinates of the bounded image plane lying at the near clipping plane. For this assignment, when you write the loop to process each pixel, you will need to take pixel coordinates in the form (i, j) and derive the equivalent real-valued viewport coordinates. You can look at the textbook's discussion of viewport transformations for more information on this. We will also review this in lecture.
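One way this mapping might be sketched is below. The convention chosen here, placing the first and last pixels exactly on the viewport boundary, is an assumption for illustration; other conventions, such as mapping pixel centers, are equally defensible:

```cpp
// Map a pixel index i in [0, n-1] to a real-valued viewport coordinate
// in [-1, +1]. Apply once per axis: horizontally with n = image width,
// vertically with n = image height.
inline double pixel_to_viewport(int i, int n) {
    return -1.0 + 2.0 * static_cast<double>(i) / (n - 1);
}
```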

A Ray from a Pixel

Presuming that you have defined a ray object, you will need to construct a new ray for each pixel being rendered. In general, you will need the 3-D world coordinates of the focal point, also called the perspective reference point, in order to create a ray. However, in this assignment you are safe to assume that the perspective reference point will always be at the origin (0, 0, 0). You also need the 3-D world coordinate of the pixel currently being rendered. We discussed in the previous item how to compute that based upon a pixel index.

Now we come to a slightly more complicated issue. Our efficient algorithm for intersecting rays with spheres works best if the distance from the point at which rays originate to the center of each sphere is computed in advance for all spheres. This forces our hand somewhat with respect to where a ray should originate. There are two choices: the ray may originate from the perspective reference point, or alternatively, the ray may originate from the actual 3-D coordinate of the pixel being rendered. In general, I have a preference for rays originating from the pixels, because it makes it easier to recover the true three-dimensional distance from the sphere back to the image plane. However, you may find it easier in this assignment to instead launch your rays from the perspective reference point. I think it is also fair to say that most ray tracers launch rays from the perspective reference point.

Whichever you choose, do take care that when you write the distance to the nearest sphere into the depth image you appropriately take this into account, and do not include the distance from the focal point to the pixel itself. One last comment on this: a lot of rendering systems do not really care terribly much about this distinction, and in many circumstances it will not matter much. However, being aware of the distinction early on is of value and should help you in understanding the underlying geometry of the ray casting procedure.

Last but not least, recall that the direction of the ray is always defined in the same manner. It is the vector obtained when the 3-D position of the perspective reference point is subtracted from the 3-D position of the pixel. Then, most importantly, this direction vector is normalized to be of unit length.
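That definition can be sketched directly in code. This assumes a simple Vec3 struct; the function name is an illustrative choice:

```cpp
#include <cmath>

struct Vec3 { double x, y, z; };

// Build the unit direction for a ray through a pixel: subtract the
// perspective reference point (in this assignment, the origin) from
// the pixel's 3-D position, then normalize to unit length.
inline Vec3 ray_direction(const Vec3 &pixel, const Vec3 &prp) {
    Vec3 d{pixel.x - prp.x, pixel.y - prp.y, pixel.z - prp.z};
    double len = std::sqrt(d.x * d.x + d.y * d.y + d.z * d.z);
    return {d.x / len, d.y / len, d.z / len};
}
```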

Display Lists

The fundamental concept of the display list is utterly central to rendering systems. Using your favorite data structure for containing a variable number of objects, you should create the display list at the time you are reading in objects from the object file. As a point of background, the display list is fundamental both to rendering engines based upon the perspective projection pipeline and polygon rendering, and equally to rendering engines based upon ray tracing.
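In C++, one natural sketch of a display list is simply a std::vector of spheres filled while parsing the object file. The names here are illustrative:

```cpp
#include <vector>

struct Sphere { double cx, cy, cz, radius; };

// The display list: a container holding every object read from the
// scene file, built once at load time and scanned during tracing.
std::vector<Sphere> display_list;

// Called once per object as the scene file is parsed.
void add_sphere(const Sphere &s) {
    display_list.push_back(s);
}
```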

Foreach Pixel call Trace

If you are directed to render an image that is 256 pixels on each side, then the very core of your code will be a doubly nested loop iterating over two indices that run between 0 and 255: these are the (i, j) pixel indices. Inside this loop will reside your call to a function most likely named "trace" that renders the associated pixel.

This function called "trace" takes one or two arguments depending on your preferences with respect to global variables. The first essential argument is the ray that is to be traced. The second argument is the display list of objects in the world. Often this latter argument is not made explicit and is instead treated as a global variable always accessible to any function running in the context of the ray tracer.

If you operate in a language that allows multiple value returns, then you can return both the depth and the color value associated with the ray cast into your scene. In other languages you may instead pass in unfilled variables that carry back that same information at the end of the call. Looking down the road, do keep in mind that we will be generalizing this function to become recursive as we move from ray casting to ray tracing.
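In C++, a std::pair is one way to approximate a multiple value return. The sketch below shows the doubly nested loop and the shape of the trace call; the trace body here is only a placeholder, and the names and the flat array layout are illustrative assumptions:

```cpp
#include <array>
#include <utility>

using Color = std::array<double, 3>;

struct Ray { double ox, oy, oz, dx, dy, dz; };

// trace returns both values at once as a pair: (depth, color).
// Placeholder body: a real version searches the display list.
std::pair<double, Color> trace(const Ray &ray) {
    (void)ray;
    return {1.0e30, Color{0, 0, 0}};
}

// The core doubly nested loop: one trace call per (i, j) pixel.
void render(int width, int height, double depth[], Color color[]) {
    for (int j = 0; j < height; ++j) {
        for (int i = 0; i < width; ++i) {
            Ray ray{};  // construct the ray through pixel (i, j) here
            auto result = trace(ray);
            depth[j * width + i] = result.first;
            color[j * width + i] = result.second;
        }
    }
}
```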

Ray Sphere Intersection

Notice that our plan for this assignment, in particular the plan for the function trace, has finally reached the level of our lecture on how to intersect a 3-D ray with a sphere. I strongly recommend that you actually write two distinct versions of this intersection routine. The first should follow the more brute-force approach based upon the quadratic equation. The second version should use the faster, more expedient variant taught second in class. It is a good idea to do both, in part because you are then in a position to compare the results of one against the other, and when they agree you will be much more confident that each is correct.
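As a sketch of the two versions, assuming a unit-length ray direction: substituting the ray o + t d into |p - c|^2 = r^2 gives t^2 - 2t(d.L) + |L|^2 - r^2 = 0 with L = c - o, which the first version solves directly; the second projects L onto the ray and rejects misses early. Whether this matches the exact variant taught in class is an assumption; the function names are illustrative:

```cpp
#include <cmath>

struct Vec3 { double x, y, z; };
inline Vec3 sub(const Vec3 &a, const Vec3 &b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
inline double dot(const Vec3 &a, const Vec3 &b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Version 1: brute-force quadratic. Returns the nearest positive t
// along the ray, or -1 on a miss. Assumes d is unit length, so the
// quadratic's leading coefficient is 1.
double hit_quadratic(const Vec3 &o, const Vec3 &d, const Vec3 &c, double r) {
    Vec3 L = sub(c, o);
    double b = dot(d, L);
    double disc = b * b - (dot(L, L) - r * r);
    if (disc < 0) return -1.0;           // ray misses the sphere
    double t = b - std::sqrt(disc);      // nearer of the two roots
    return (t > 0) ? t : -1.0;
}

// Version 2: the geometric variant. Project L onto the ray to find the
// closest approach, and exit early if that point lies outside the sphere.
double hit_geometric(const Vec3 &o, const Vec3 &d, const Vec3 &c, double r) {
    Vec3 L = sub(c, o);
    double tca = dot(d, L);              // distance along ray to closest approach
    double d2 = dot(L, L) - tca * tca;   // squared miss distance from center
    if (d2 > r * r) return -1.0;         // early exit: no intersection
    double t = tca - std::sqrt(r * r - d2);
    return (t > 0) ? t : -1.0;
}
```

Running both on the same scene and checking that they agree, as suggested above, is a cheap and effective correctness test.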

The Closest Object

Let us consider more carefully what is inside the function called "trace". Inside this function is a loop that iterates over all of the objects in the display list. Prior to entering this loop, variable initialization should facilitate recording the closest sphere to the camera. In other words, you are probably going to set a variable to record the distance to the closest sphere, and you are very likely to initialize it to the largest possible double float. You will of course also need some variables in which to record information associated with the closest sphere found so far in the iteration.

Now the heart of this function trace is a loop testing for intersection between the ray and each sphere in the display list, and, if an intersection is found, testing whether it is the closest sphere yet found to intersect the ray. When this loop is finished you will have either the default values, because the ray intersects no spheres, or the recorded distance to the closest sphere as well as the color of that sphere. That is precisely the information which will then be returned by this function.
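Putting the pieces together, here is one sketch of that closest-object search, using the quadratic intersection as a helper. The structure, initialization to the largest double, and the black background default are illustrative choices:

```cpp
#include <array>
#include <cmath>
#include <limits>
#include <utility>
#include <vector>

using Color = std::array<double, 3>;
struct Vec3 { double x, y, z; };
struct Sphere { Vec3 center; double radius; Color color; };
struct Ray { Vec3 origin, direction; };

// Helper: nearest positive t for a ray-sphere hit, or -1 on a miss.
double intersect(const Ray &ray, const Sphere &s) {
    Vec3 L{s.center.x - ray.origin.x, s.center.y - ray.origin.y,
           s.center.z - ray.origin.z};
    double b = ray.direction.x * L.x + ray.direction.y * L.y + ray.direction.z * L.z;
    double disc = b * b - (L.x * L.x + L.y * L.y + L.z * L.z - s.radius * s.radius);
    if (disc < 0) return -1.0;
    double t = b - std::sqrt(disc);
    return (t > 0) ? t : -1.0;
}

// trace: scan the display list and keep the closest hit. The defaults
// survive when nothing is hit: largest-possible depth, black color.
std::pair<double, Color> trace(const Ray &ray, const std::vector<Sphere> &scene) {
    double best = std::numeric_limits<double>::max();
    Color color{0, 0, 0};
    for (const Sphere &s : scene) {
        double t = intersect(ray, s);
        if (t > 0 && t < best) {  // closer than anything found so far
            best = t;
            color = s.color;      // retrieve this sphere's color
        }
    }
    return {best, color};
}
```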

Color - The Most Basic Material Information

In this assignment we take a small step toward what in general falls under the category of material properties. In other words, each sphere has associated with it a color represented as a red, green, blue triple. This requires your code to carry out an essential operation: when you find the sphere closest to the camera, you must retrieve the color value from that sphere. Make sure that your design makes this process easy.