Floating Point Representation Basics

Sunday, 27 July 2014

The details in this post are acquired from IEEE 764 floating point standards. Usually a real number in binary will be represented in the following format- I_mI_m-1…I₂I₁I₀.F₁F₂…F_nF_n-1where I_m and F_n will be either 0 or 1 of integer and fraction parts respectively.

A finite number can also be represented by four integer components : a sign (s), a base (b), a significand (m), and an exponent (e). Then the numerical value of the number is evaluated as (-1)^s x m x b^e.In the binary32 format the floating point numbers are stored in the single precision format. In the single precision format there are 23 bits for the significand 8 bits for the exponent and 1 bit for the sign.

For example, the rational number 9÷2 can be converted to single precision float format as following,

9₍₁₀₎ ÷ 2₍₁₀₎ = 4.5₍₁₀₎ = 100.1₍₂₎

The result said to be normalized, if it is represented with leading 1 bit, i.e. 1.001(2) x 22. Omitting this implied 1 on left extreme gives us the mantissa of float number.

In other words, the above result can be written as (-1)⁰ x 1.001₍₂₎ x 2² which yields the integer components as s = 0, b = 2, significand (m) = 1.001, mantissa = 001 and e = 2. The corresponding single precision floating number can be represented in binary as shown below,

Where the exponent field is supposed to be 2, yet encoded as 129 (127+2) called biased exponent.

One of the goals of the IEEE floating point standards was that you could treat the bits of a floating point number as a (signed) integer of the same size, and if you compared them that way, the values will sort into the same order as the floating point numbers they represented.
If you used a twos-complement representation for the exponent, a small positive number (i.e., with a negative exponent) would look like a very large integer because the second MSB would be set. By using a bias representation instead, you don't run into that -- a smaller exponent in the floating point number always looks like a smaller integer.

A bias of (2^n-1 – 1), where n is # of bits used in exponent, is added to the exponent (e) to get biased exponent (E). So, the biased exponent (E) of single precision number can be obtained as

E = e + 127

The range of exponent in single precision format is -126 to +127. Other values are used for special symbols.

Double Precision Format:

Precision:
The smallest change that can be represented in floating point representation is called as precision. The fractional part of a single precision normalized number has exactly 23 bits of resolution, (24 bits with the implied bit). This corresponds to log₍₁₀₎ (2²³) = 6.924 = 7 (the characteristic of logarithm) decimal digits of accuracy. Similarly, in case of double precision numbers the precision is log₍₁₀₎ (2⁵²) = 15.654 = 16 decimal digits.

Accuracy:
Accuracy in floating point representation is governed by number of significand bits, whereas range is limited by exponent. Not all real numbers can exactly be represented in floating point format. For any numberwhich is not floating point number, there are two options for floating point approximation, say, the closest floating point number less than x as x- and the closest floating point number greater than x as x+. A rounding operation is performed on number of significant bits in the mantissa field based on the selected mode. The round down mode causes x set to x_, the round up mode causes x set to x+, the round towards zero mode causes x is either x- or x+ whichever is between zero and. The round to nearest mode sets x to x- or x+ whichever is nearest to x. Usually round to nearest is most used mode. The closeness of floating point representation to the actual value is called as accuracy.

Special Bit Patterns:
The standard defines few special floating point bit patterns. Zero can’t have most significant 1 bit, hence can’t be normalized. The hidden bit representation requires a special technique for storing zero. We will have two different bit patterns +0 and -0 for the same numerical value zero. For single precision floating point representation, these patterns are given below,

0 00000000 00000000000000000000000 = +0
1 00000000 00000000000000000000000 = -0

Similarly, the standard represents two different bit patters for +INF and -INF. The same are given below,
0 11111111 00000000000000000000000 = +INF
1 11111111 00000000000000000000000 = -INF

All of these special numbers, as well as other special numbers (below) are subnormal numbers, represented through the use of a special bit pattern in the exponent field. This slightly reduces the exponent range, but this is quite acceptable since the range is so large.

An attempt to compute expressions like 0 x INF, 0 ÷ INF, etc. make no mathematical sense. The standard calls the result of such expressions as Not a Number (NaN). Any subsequent expression with NaN yields NaN. The representation of NaN has non-zero significand and all 1s in the exponent field. These are shown below for single precision format (x is don’t care bits),
x 11111111 1m0000000000000000000000
Where m can be 0 or 1. This gives us two different representations of NaN.
0 11111111 110000000000000000000000 _____________ Signaling NaN (SNaN)
0 11111111 100000000000000000000000 _____________Quiet NaN (QNaN)
Usually QNaN and SNaN are used for error handling. QNaN do not raise any exceptions as they propagate through most operations. Whereas SNaN are which when consumed by most operations will raise an invalid exception.

Overflow and Underflow:
Overflow is said to occur when the true result of an arithmetic operation is finite but larger in magnitude than the largest floating point number which can be stored using the given precision. Underflow is said to occur when the true result of an arithmetic operation is smaller in magnitude (infinitesimal) than the smallest normalized floating point number which can be stored. Overflow can’t be ignored in calculations whereas underflow can effectively be replaced by zero.

Posted By:

saswat raj

Segmented Sieve of Eratosthenes

If you have enough space to store all primes upto sqrt(b) then you can sieve for primes in the range a to b using additional space O(b - a) :

To do this let us take an example of the #SPOJ problem 2 : PRIME1 where we are given the upper limit as 1000000000 which is quite big if we want to store all the numbers in an array although we can store them by reducing them to array where each bit of a number stores whether the corresponding number is a prime or not i.e. if there are 64 numbers in the sieve we need two integers of 32 bits to store information of whether the number is prime or not.

Another method is if you have space to store upto square root of upper limit using sieve of eratosthenes then we can calculate the sieve of b-a from the sieve already present. The way to do it is mentioned below :

1. First get the sieve upto the square root of upper limit which in this case is 1000000000. For 1000000000 the square root comes near 31625. Hence we create an array pre_sieve[31625] and fill it with the values:

pre_sieve[31265] = {1};
for(int i=2;i<=177;i++)
{
    if(pre_sieve[i] == 1)
    {
         for(int j = i*i;j< = 31264; j+=i)
             pre_sieve[j] = 0;
    }
} 
vector<int> primes;
for(int i=2;i<=31264;i++)
if(pre_sieve[i] == 1)
    primes.push_back(i);

Now for a range [a,b] we need to find the closest number that is in the range and is a composite number of the prime p. To do this we can divide the start of the range by p and take the floor and then multiply the resulting by p.We can then iterate in the sieve to get the primes.

for(int i=0;i<primes.size();i++)
{
     int *arr = (int *)malloc(sizeof(int)*(b-a+1));
     memset(arr,1,b-a+1);
     int prime = primes[i];
     int low = a/prime;
     low = low * prime;
     for(int j = low * 2;j<=b;j+= prime)
             arr[j] = 0 
}

Again you can reduce the space further by only sieving the odd number.

Posted By:

saswat raj

Autotools Linux : Served On The Platter #2

Monday, 10 March 2014

Autotools, Linux, servedontheplatter

Autotools are a great way of packaging binary applications in Linux.The autotools need to be installed by installing the auto-tools-dev and other dependencies.It uses the old Makefile way for compiling the program and checking dependencies. Here in this we take a sample C program depending on wxWidgets and openGL and build the install scripts using Autotools.Note that the directory structure is as follows :
|___Canvas
           |____src [ the source folder ]
           |         |__paint.cpp
           |____res [ the resources and data ]

Autotools uses a predefined directory for installing files and these directories are named in the autotools site.For eg. $PREFIX stands for /usr/local/ hence when you write your program make sure to use these locations.

Autoscan: We use an automated tool to generate the initial configure.scan script. In the root directory run 'autoscan' command.This will generate a configure.scan and a report there by traversing through all the directories for dependencies. Now rename this to 'configure.ac' and edit it as follows.

#                                               -*- Autoconf -*-
# Process this file with autoconf to produce a configure script.

AC_PREREQ([2.69])
AC_INIT([canvas], [1.0], [saswat.cs11@iitp.ac.in])
AM_INIT_AUTOMAKE([-Wall foreign])
AC_CONFIG_SRCDIR([src/paint.cpp])
AC_CONFIG_HEADERS([config.h])

# Checks for programs.
AC_PROG_CXX

# Checks for libraries.
AM_OPTIONS_WXCONFIG
reqwx=2.4.0
AM_PATH_WXCONFIG($reqwx, wxWin=1)
if test "$wxWin" != 1; then
 AC_MSG_ERROR([
  wxWidgets must be installed on your system.
 
  Please check that wx-config is in path, the directory
  where wxWidgets libraries are installed (returned by
  'wx-config --libs' or 'wx-config --static --libs' command)
  is in LD_LIBRARY_PATH or equivalent variable and
  wxWidgets version is $reqwx or above.
  ])
fi
 
CPPFLAGS="$CPPFLAGS $WX_CPPFLAGS"
CXXFLAGS="$CXXFLAGS $WX_CXXFLAGS_ONLY"
CFLAGS="$CFLAGS $WX_CFLAGS_ONLY"
LIBS="$LIBS $WX_LIBS -lGL -lGLU -lglut `wx-config --gl-libs`"

# Checks for header files.

# Checks for typedefs, structures, and compiler characteristics.
AC_CHECK_HEADER_STDBOOL

# Checks for library functions.
AC_FUNC_MALLOC


AC_OUTPUT(Makefile src/Makefile res/Makefile)

Now notice the Macros in the section.AC_INIT initialize the package name and version and bug to send mail.AC_PROG_CXX checks for C++ in the system.We then check for wxWidgets in the system.This is present in their wiki for code above.Next add the FLAGS for compilation of librariesAC_OUTPUT checks for the Makefiles that should be generated as mentioned in brackets.Each must have a Makefile.am ready in them.

MAKEFILES: Each makefile.am should be as follows.In the source directory the Makefile is as follows:

bin_PROGRAMS = canvas
canvas_SOURCES = paint.cpp

The bin_PROGRAMS specifies the following binaries to be compiled and stored in the $PREFIX/bin directory.The _SOURCES specify the sources of the binaries.For the data directory the makefile should be as follows:

resdir = $(datarootdir)/@PACKAGE@
res_DATA = canvas.png chk.png crop.png curve.png erase.png ...

We define a data directory according to the one used in the binary and the data files will be written to it.

COMMANDS: Once these files are ready run the following commands in order:
* aclocal
* automake --add-missing
* autoconf
* autoreconf -fvi
You will get your scripts.Run configure and make install to install.This project can be found in my github account : https://github.com/saswatraj/Canvas

Posted By:

saswat raj

Objects in Javascript : Served on the Platter

Sunday, 9 March 2014

servedontheplatter

The simple types of javascript are : numbers , booleans , strings , null and undefined. All other types are objects in javascript. Hence a detailed knowledge is required for manipulating objects in javascript. Remember : Objects in javascript are class free. They are basically name value pairs surrounded by curly braces.Objects are always passed around by reference.

Initialization :

var empty_object = {};

var exobj = {

        'name':'saswat',

        'occupation':'student'

};

Retireval :

exobj['name'] //returns 'saswat'

exobj.name    //preferred and more cleaner

exobj.status  // not present hence returns undefined

For filling in default values for undefined objects use the '||' operator.Attempting to retrieve values from undefined will throw a TypeError exception.

Updation :

Updation can be done by the assignment operator. If the object does not have the property name then it is augmented to the property names present with the value that is assigned.

Prototype :

Every object in javascript inherits properties from some other object. All objects created directly inherit from the Object.prototype an object that comes along with javascript.

If we try to retrieve a value from an object and that property is not present javascript tries to retrieve the property from the obejcts prototype and then from the prototype of the prototype and so on till it reaches the Object.prototype. If it is found nowhere in the chain then it is undefined.This is called delegation in javascript.

A detailed description on Prototype will be provided in the next few platters of inheritance in javascript.However for now we can use Object.create function which takes a prototype from which to derive the new Object. One can also assign the protoype property with an object . i.e.

exobj.prototype = new User() ; //user as prototype object

exobj.prototype = Object.create(User.prototype);

The difference between the two is that in Object.create the constructor is not called hence the object remains uninitialized.

If we add a new property to the prototype then the new property will show in each object derived from that prototype.

But then how do we know whether the property belongs to the object or to the prototype chain.This can be known by using the hasOwnProperty which does not look into the property chain.For example:

exobj.hasOwnProperty('name') //true

exobj.hasOwnProperty('constructor') //false

Enumeration :

For enumerating the properties of the object use the for in construct .This includes the prototype chain properties.Hence we need to filter using typeof and hasOwnProperty. Careful Note :: The elements can come in any order.

for ( property in exobj ){

    if( exobj.hasOwnProperty(property) && typeof exobj[property] == 'string')

      //do something

Deletion : The delete operator can be used to remove a property from object. Deleting may allow the prototype property to be accessible.

Posted By:

saswat raj

HTTP 1.1 headers

Wednesday, 29 January 2014

When a browser makes a request to a server it does in the form of requests formatted by some rules so that the server can understand what resource the browser is asking for.A normal REST http header would look like this :

POST /enlighten/rest HTTP/1.1
Host: api.opencalais.com
Content-Type: application/x-www-form-urlencoded
Content-Length: length

licenseID=string&content=string&/paramsXML=string

The response from the server side would look something like :

HTTP/1.1 200 OK
Content-Type: text/xml;charset=utf-8
Content-Length: length

string

Parts of the HTTP Request and their Significance

 Accept: audio/*; q=0.2, audio/basic

The Accept request-header field can be used to specify certain media types which are acceptable for the response.If no Accept header field is present, then it is assumed that the client accepts all media types.

  Accept-Charset: iso-8859-5, unicode-1-1;q=0.8

The Accept-Charset request-header field can be used to indicate what character sets are acceptable for the response.The special value "*", if present in the Accept-Charset field, matches every character set (including ISO-8859-1) which is not mentioned elsewhere in the Accept-Charset field

  Accept-Encoding: gzip;q=1.0, identity; q=0.5, *;q=0

The Accept-Encoding request-header field is similar to Accept, but restricts the content-codings that are acceptable in the response.

  Accept-Language: da, en-gb;q=0.8, en;q=0.7

Accept-Lnguages tells about the language that is preferred in the response of the output.

  Accept-Ranges: bytes

The Accept-Ranges response-header field allows the server to indicate its acceptance of range requests for a resource.This is what the internet download manager looks for to tell whether a particular resource is resumable or not.

  age-value = delta-seconds

The Age response-header field conveys the sender's estimate of the amount of time since the response (or its revalidation) was generated at the origin server. A cached response is "fresh" if its age does not exceed its freshness lifetime.Age values are non-negative decimal integers, representing time in seconds.

  Allow: GET, HEAD, PUT

The Allow entity-header field lists the set of methods supported by the resource identified by the Request-URI. The purpose of this field is strictly to inform the recipient of valid methods associated with the resource. An Allow header field MUST be present in a 405 (Method Not Allowed) response.

  Authorization  = "Authorization" ":" credentials

A user agent that wishes to authenticate itself with a server-- usually, but not necessarily, after receiving a 401 response--does so by including an Authorization request-header field with the request. The Authorization field value consists of credentials containing the authentication information of the user agent for the realm of the resource being requested.

  Cache-Control   = "Cache-Control" ":" 1#cache-directive

The Cache-Control general-header field is used to specify directives that MUST be obeyed by all caching mechanisms along the request/response chain. The directives specify behavior intended to prevent caches from adversely interfering with the request or response.

The following Cache-Control response directives allow an origin server to override the default cacheability of a response:

public: Indicates that the response MAY be cached by any cache, even if it would normally be non-cacheable or cacheable only within a non- shared cache. (See also Authorization, section 14.8, for additional details.)
private: Indicates that all or part of the response message is intended for a single user and MUST NOT be cached by a shared cache. This allows an origin server to state that the specified parts of the
: response are intended for only one user and are not a valid response for requests by other users. A private (non-shared) cache MAY cache the response.; Note: This usage of the word private only controls where the response may be cached, and cannot ensure the privacy of the message content.
no-cache: If the no-cache directive does not specify a field-name, then a cache MUST NOT use the response to satisfy a subsequent request without successful revalidation with the origin server.

  Connection: close

The Connection general-header field allows the sender to specify options that are desired for that particular connection and MUST NOT be communicated by proxies over further connections.

  Content-Length: 3495

The Content-Length entity-header field indicates the size of the entity-body, in decimal number of OCTETs, sent to the recipient or, in the case of the HEAD method, the size of the entity-body that would have been sent had the request been a GET.

  Content-MD5   = "Content-MD5" ":" md5-digest
  md5-digest   = <base64 of 128 bit MD5 digest as per RFC 1864>

The Content-MD5 entity-header field, as defined in RFC 1864 [23], is an MD5 digest of the entity-body for the purpose of providing an end-to-end message integrity check (MIC) of the entity-body. (Note: a MIC is good for detecting accidental modification of the entity-body in transit, but is not proof against malicious attacks.)

For more headers information check RFC section for HTTP headers.

Posted By:

saswat raj

Rotation Point in Sorted Rotated Array

Thursday, 19 September 2013

Algorithms, Arrays, Rotation Point, Sorting

Question : What is the optimal algorithm to find the minimum element given a rotated sorted array of integers? A rotated sorted array of integers is the output array of a rotation operation performed on a sorted array. Eg: 3 5 6 1 2

Answer : One way to solve the above problem in linear time is to get the minima from the array using difference array and then finding the sign change from positive to negative.This will solve the problem in O(n) time.

/*
* Solution for O(n) time complexity
*/
int getRotationPoint(int *arr,int length)
{
    int *diff=(int *)malloc(sizeof(int)*length-1);
    if(length==1)
    return 0;
    assert(length>=2);
    for(int i=1;i0)?PS:NG;
    for(int i=0;i0)?PS:NG;
        if((sg==NG)&&(sign==PS))
        {
            index=i;
            break;
        }
        sign=sg;
    }
    return index+1;
}

However for an even tighter bound we modify the binary search algorithm and use it.We also could use the fact that for a sorted array, if the array were circular the index of the smallest would be the index after the largest element.Hence this solution is a O(log n) solution.

/*
* Solution for O(nlogn) time
*/
int getRotationPointBi(int *arr,int length)
{
    int first=0;
    int last=length-1;
    int middle=(last+first)/2;
    while(last>first)
    {
        middle=(last+first)/2;
        if(arr[first]>=arr[middle])
        last=middle;
        else if(arr[middle]>=arr[last])
        first=middle;
    }
    return middle+1;
}

Posted By:

saswat raj

Binary Search Trees : Interview Question

Wednesday, 18 September 2013

Question : In a BST to every element add sum of elements greater than it.

Answer: The first thing is to notice that on doing this the resulting tree would be the reverse of the binary search tree. Implementing this you should notice that the inorder travesal prints the elements in increasing order.Hence we need to modify the inorder traversal a bit to do this.Here is a sample C++/C code for insert and inorder for a BST.

#include<iostream>
#include<cstdlib>

using namespace std;

struct node
{
    int data;
    struct node *parent;
    struct node *left;
    struct node *right;
}*root;

typedef struct node node;

/**
*Makes a new node from the integer
*value and returns a pointer to the
*new created node
*/
node* makeNode(int num)
{
    node *temp=(node*)malloc(sizeof(node));
    temp->data=num;
    temp->right=NULL;
    temp->left=NULL;
    return temp;
}

/**
 * Inserting a node in a BST
 */
void insert(int num,node *T)
{
    if(T==NULL)
    root=makeNode(num);
    else
    {
        if(T->data>num)
        {
            if(T->left==NULL)
            T->left=makeNode(num);
            else
            insert(num,T->left);
        }
        else
        {
            if(T->right==NULL)
            T->right=makeNode(num);
            else
            insert(num,T->right);
        }
    }
}

/**
*Inorder traversal of the tree
*/
void inorder(node* T)
{
    if(T!=NULL)
    {
        inorder(T->left);
        cout<<T->data<<" ";
        inorder(T->right);
    }
}

Now to add the elements which are greater than a particular node we need to reverse the inorder by accessing the bigger elements i.e. right first and then the left. On accessing the right add the data to the data already collected and then change the node value accordingly.Hope you better understand by the code.

void chorder(node *T,int *data)
{
    if(T!=NULL)
    {
        chorder(T->right,data);
        *data+=T->data;
        T->data=*data;
        chorder(T->left,data);
    }
}

Notice that that data needs to be reference so that after being updated it is assigned to the other nodes.

Posted By:

saswat raj

Floating Point Representation Basics

Segmented Sieve of Eratosthenes

Autotools Linux : Served On The Platter #2

Objects in Javascript : Served on the Platter

HTTP 1.1 headers

Rotation Point in Sorted Rotated Array

Binary Search Trees : Interview Question

code-craft

Blog Archive