High-Precision, High-Speed Handheld Scientific Calculator Design Based on a Hardware CORDIC Algorithm

Translation
High-Precision, High-Speed Handheld Scientific Calculator Design Based on a Hardware CORDIC Algorithm
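This paper proposes a high-precision, hardware-efficient CORDIC structure for a handheld scientific calculator. The structure computes the standard transcendental functions. It does not use approximation methods such as rational or polynomial functions; the complete structure can be implemented in hardware using only shift and add operations. The paper analyses in detail the relationship between the number of CORDIC stages and angle convergence. A new technique is proposed that uses fine-grain and coarse-grain methods to reduce the size of the ROM-based look-up table (LUT). The proposed architecture computes all basic elementary functions, such as logarithmic, trigonometric and exponential functions.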
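Electronic calculators have existed for nearly 50 years. Early electronic calculators were bulky and cumbersome, and using them required considerable knowledge. In the mid-1960s, Dr. Wang of Wang Laboratories made many improvements to the electronic calculator, eventually leading to the machine we know today. Other manufacturers such as Hewlett-Packard were also busy with early calculator designs. HP released the world's first pocket calculator in the late 1960s. The greatest improvement in these designs, however, was the use of microelectronics. Without the silicon chip, HP could never have invented the pocket calculator. The greatest microelectronic device ever created was in fact intended to be a calculator chip: the Intel 4004, the world's first microprocessor. Although it was never used by its original contractor in a calculator, the 4004 became the model on which all later microprocessors were designed. The calculator world, by contrast, went in a completely different direction: dense hardware. Calculator chips are not microprocessors. They are complex large-scale integration devices that behave somewhat like a microprocessor but actually do their work in dense hardware. This left the microprocessor free for use in other devices, and eventually in small computers. Meanwhile, the calculator chip evolved into a compact, high-transistor-count specialized device, reduced to the basics and cheap to manufacture. Unlike the 1970s, when a pocket calculator contained a normally packaged chip (typically a 24- or 28-pin DIP) mounted through-hole on an ordinary printed circuit board, calculator chips today are so common and so cheap that they are made as die-cut chips, epoxy-blobbed onto the printed circuit board and wired directly to the copper traces. In many cases the chip is bonded directly to the display substrate, forming one complete integrated chip/display package. This manufacturing method is very inexpensive, but it means that a regular DIP-packaged chip is no longer needed and therefore no longer available.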
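In a calculator, as in any modern computing system, computing elementary and transcendental mathematical functions such as sine, cosine and logarithms is a required capability. These functions can be evaluated at any point in their domain by any of several methods. The best known are Taylor series expansion, Chebyshev series expansion, the CORDIC method and its derivatives, Briggs's method for logarithms, Newton's method and polynomial approximation. The methods differ mainly in the primitive operations they need, such as addition, multiplication and factorial evaluation, and in the number of iterations they need to produce a result of a given accuracy. Among them, the most popular in integrated-circuit implementations for calculators and microprocessors is the CORDIC method. Its popularity comes from needing only the relatively simple primitive operations of addition and shifting, and from the wide range of trigonometric and exponential functions it can evaluate.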
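CORDIC (COordinate Rotation DIgital Computer) is an iterative algorithm that uses only shift-and-add operations. It can rotate a two-dimensional vector in linear, circular and hyperbolic coordinates. CORDIC has a wide range of applications in signal processing and matrix operations, for example in various popular transforms. Over the past decade or more, extensive research has been devoted to modern signal processing algorithms and methods that have widespread potential, provided they can be implemented in real time using low-cost hardware. The real-time computational complexity of such algorithms tends to be high, usually significantly higher than that of more conventional techniques (for example those based on the fast Fourier transform (FFT) and finite-impulse-response (FIR) filters), and in many cases it reaches the limit of what current technology can achieve. Consequently, considerable research has also gone into parallel architectures and into systematic methods for mapping matrix algorithms onto such architectures.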
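Digital signal processing has long been dominated by microprocessors with enhancements such as single-cycle multiply-accumulate instructions and special addressing modes. Although these processors are low in cost and very flexible, they are often not fast enough for truly demanding DSP tasks. The advent of reconfigurable logic computers allows the higher speed of dedicated hardware solutions at a cost that competes with the traditional software approach. Unfortunately, algorithms optimized for microprocessor-based systems do not map well onto hardware. Hardware-efficient solutions often exist, but the dominance of software systems has kept them out of the spotlight. Among these hardware-efficient algorithms is a class of iterative solutions for trigonometric and other transcendental functions that use only shifts and adds. The trigonometric functions are based on vector rotation, while other functions, such as square root, are implemented using an incremental expression of the desired function. The trigonometric algorithm is called CORDIC. The incremental functions are performed with a very simple extension to the hardware architecture; although not CORDIC in the strict sense, they are usually included because of the close similarity. CORDIC algorithms generally produce one additional bit of accuracy per iteration.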
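The trigonometric CORDIC algorithms were originally developed as a digital solution for real-time navigation problems. The original work is credited to Jack Volder. The CORDIC algorithm has been widely applied, including in the 8087 math coprocessor, the HP-35 calculator, radar signal processors and robotics. CORDIC rotation has also been used for computing discrete Fourier transforms, discrete cosine transforms and singular value decompositions, and for solving linear systems.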
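Transcendental functions are essential in a wide range of engineering applications, such as image processing, manipulator kinematics, digital signal processing (DSP) and optimization, in which a large number of trigonometric and/or exponential operations must be computed efficiently in hardware or on a scientific calculator.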
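The basic idea of CORDIC computation is to decompose the desired rotation angle into a weighted sum of a set of predefined elementary rotation angles, so that the rotation through each of them can be done with simple shift-and-add operations. All trigonometric functions can be computed or derived using vector rotations. Vector rotation can also be used for polar-to-rectangular and rectangular-to-polar conversion and for vector magnitude, and as a building block in certain transforms such as the DFT and DCT. The CORDIC algorithm can perform the basic arithmetic operations (multiplication, division) and evaluate the basic trigonometric and hyperbolic functions by iterating a set of coupled equations that involve only additions, subtractions and bit shifts.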

(1)

(2)

(3)
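The angle rotation can be carried out in several steps, using an iterative process. Each step completes a small part of the rotation, and many steps together make up one planar rotation. A single step is defined as: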

(4)
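The equation above can be simplified by factoring out the cosine term, which reduces the algorithm to a few simple shifts and additions. The remaining coefficient can be removed from the iteration by pre-computing its final value: because the elementary angles are fixed, each cosine is a constant, and their product is collected into a single constant K: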

(5)
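K is constant for all initial vectors and for all values of the rotation angle; it is usually called the aggregate constant. The derived constant P (approximately 1.64676), the reciprocal of K, is defined here. For radix 2, the coefficient K is pre-computed and taken into account at a later stage. The parameter m is introduced to select among three modes of operation: circular, linear and hyperbolic.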

(6)

(7)
The X and Y values are the final results computed after n iterations, as shown in Figure 1 (rotation of the vector).

(8)

For every step of the rotation, Sn is computed as the sign of Zn.

Or, in a program-like style:
for i = 0 to n-1

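With the algorithm and equations above, CORDIC can compute all transcendental functions with good accuracy. For the different modes of operation, the selection, inputs and outputs are shown in Table 1.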
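From Table 1 we can compute all the transcendental functions; the tangent can also be derived from the sine and cosine values. The values atan(2^-n) are pre-calculated and stored in a table. The scale factor, however, may not remain constant or predictable; compensating for it may require complicated hardware structures (as shown in Figure 2) or comparable post-processing cycles.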
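CORDIC can operate in two different modes, (i) rotation and (ii) vectoring, in which the z or the y variable respectively is forced to zero through a series of iterations.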
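In rotation mode, the angle accumulator is initialized with the desired rotation angle. The rotation decision at each iteration is made so as to reduce the magnitude of the residual angle in the accumulator, and is based on the sign of the residual angle after each step.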

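This algorithm is commonly called driving Z to zero. At that point, the X and Y values are the cosine and sine. The CORDIC core computes the sine and cosine values: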

(9)
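Driving Z to zero has a special case: polar-to-rectangular conversion. In vectoring mode, the CORDIC rotator rotates the input vector through whatever angle is needed to align it with the x-axis. The result of the vectoring operation is a rotation angle and the scaled magnitude of the original vector. The vectoring function determines the rotation direction at each step. If the angle accumulator is initialized to zero, it holds the traversed angle at the end of the iterations. This algorithm is commonly called driving Y to zero (rectangular-to-polar conversion). The CORDIC core then computes the magnitude and phase: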

(10)
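Each of these modes can be used in the circular, linear and hyperbolic coordinate systems to compute various functions. The classical CORDIC approach has five main drawbacks: (1) the need for scale-factor compensation; (2) the restriction on the magnitude of the input variables; (3) low execution speed; (4) the CORDIC angle quantization effect; and (5) the representation of input angles and output values under IEEE standards. The first drawback requires additional multiplications. The second has a significant impact on the accuracy of the computed function, because accuracy depends on how closely the variable y (vectoring mode) or z (rotation mode) can be driven to zero; they can be driven close to zero only if the initial inputs lie within a range called the range of convergence (±99.88 degrees). The third drawback comes from the iterative nature of the CORDIC algorithm. The fourth arises from the angle approximation inherent in the iteration, and the last from the input representation.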
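The main aim of this work is to develop a CORDIC processor that is power efficient, free from the scale-factor compensation problem, convergent over the entire coordinate space, and free from a look-up table (ROM). More specifically, we concentrate on the rotation and vectoring modes of operation (driving z to 0 or y to 0 respectively) in the Cartesian coordinate system.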
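The performance and time efficiency of the CORDIC method are studied and evaluated by two methods, based on the conventional CORDIC and on Matlab simulation. The first is the absolute-error constraint method, in which the absolute error is taken as the difference between the CORDIC result and the Matlab simulation result. Angles from 0 to 90 degrees were analysed in steps of 5 degrees (as shown in Figure 3); the minimum error at iteration 16 is 0.0000531 (as shown in Table 2).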

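From the plot and the table we can conclude that 16 iterations are enough to compute the cosine and sine of all the angles. In the conventional algorithm, accuracy improves as the number of iterations increases, but angle sensors provide at most 2 decimal places of precision, so 2 decimal places is the target here. For that precision, 16 iterations are more than enough to compute the trigonometric values accurately.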

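In general, the initial iterations take atan(2^-i) from the look-up table. After the first k iterations, from iteration k+1 onward, the angle is no longer read from the table; instead 2^-i itself, a shifted version of the iteration number, is used as the angle. This minimizes the number of look-up locations and therefore the memory size. A smaller look-up table is efficient for a calculator that performs the basic computation of all elementary functions. The angle error can be computed for each value of k, and k is chosen according to the error rate; the data in the table are used for this comparison. Beyond about 5 iterations, the shifted value 2^-i is almost equal to the angle in radians from the look-up table, so the angle can be taken from the shifted iteration number rather than from the table.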
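For example, at iteration 9 the angle from the look-up table is 0.1119056770662746 degrees, which is 0.001953123 radians, while the shifted value 2^-9 is 0.001953125. This shows that, after a certain number of iterations, the shifted value and the look-up angle are effectively the same. The iterations can be analysed by calculating the error rate between the normal computation and the fine-grain computation. The angle value in the normal computation and the fine-grain value should agree so that the angle error is minimized; if the angle error is large, the computed result will not match the normal computation. The angle error should therefore be kept very small to obtain a high-accuracy result. After 5 iterations the error rate falls to 1.4635*10^-5 (as shown in Table 3). Since the accuracy of the angle computation after 16 iterations is 4 decimal places, after 5 iterations the value can be taken from the shift instead of the LUT. The error calculation table is given below.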

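A memory access takes two cycles: one to search for the data (tag check) and one to read the data from the ROM. If the angle is taken from the shifted iteration number, one clock is enough to shift the result, since every stage has its own shifter. No extra computation is needed, and memory is saved in the handheld scientific calculator. Comparing memory sizes, the conventional algorithm requires a total of 576 bits for 16 iterations, whereas the proposed architecture needs only 180 bits, a reduction by more than a factor of three; the saving matters even more when higher precision is required. The memory calculation is based on a 2-bit tag and a 32-bit single-precision floating-point number for the angle representation.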
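The performance of the radix-2 CORDIC implementation was analysed with the RC Compiler (Cadence 6.1) ASIC design tool. The results were compared across different libraries. For 180nm technology the slow_normal.lib library is used. For 90nm technology tcbn90gbwp7tlvttc0d88_ccs.lib is used. For 45nm technology a third library is used (its name is not given). As the technology shrinks, the worst-path delay, area and net power all go down. Table 4 gives the synthesis results for a single-stage radix-2 CORDIC architecture in circular rotation mode. Table 5 gives the synthesis results for a 16-stage radix-2 CORDIC architecture in circular rotation mode. Synthesis showed a maximum clock frequency of 37.7 MHz for 45nm technology, which is reasonable given the complexity of implementing floating-point adders.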

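This paper has presented the design and analysis of a high-precision, high-speed handheld scientific calculator. The computational precision of the proposed processor is similar to that of a conventional CORDIC processor, while its hardware complexity is lower. The work also shows that the hardware cost of the processor can be reduced significantly. Compared with the classical CORDIC, the proposed processor consumes less power because the number of arithmetic operations actually required is much smaller, which improves the calculator's computation and precision.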

Original text
High Precision and High Speed Handheld Scientific Calculator Design
This paper proposes a high-precision, hardware-efficient CORDIC structure for a handheld scientific calculator. The structure computes the standard transcendental functions. It does not rely on approximation methods such as rational or polynomial functions; the complete structure can be realized in hardware using only shift and add operations. The paper presents a detailed analysis of the number of CORDIC stages versus angle convergence. A novel technique is proposed to reduce the size of the ROM-based look-up table (LUT) by using fine-grain and coarse-grain methods. The proposed architecture computes all basic elementary functions, such as logarithmic, trigonometric and exponential functions.
Electronic calculators have been around for nearly 50 years. Early electronic calculators were bulky and cumbersome, and took quite a bit of knowledge to use. In the mid-1960s, Dr. Wang of Wang Laboratories made many improvements to the electronic calculator, eventually leading to the machine we know so well today. Other manufacturers such as Hewlett-Packard were also busy with early calculator designs. HP released the world's first "pocket" calculator in the late 1960s. The greatest improvement to any of these designs, however, was the use of microelectronics. Without the silicon chip, Hewlett-Packard could never have invented the "pocket" calculator. And the greatest microelectronic device ever created was actually intended to be a calculator chip: the Intel 4004, the world's first microprocessor. Although it was never used by its original contractor in a calculator, the 4004 became the model on which all later microprocessors were designed. The calculator world, by contrast, went in a totally different direction: dense hardware. Calculator "chips" are not microprocessors. They are complicated large-scale integration devices that act somewhat like a microprocessor but actually do their work in dense hardware. This left the microprocessor available for use in other devices, and eventually in small computers. In the meantime, the calculator chip evolved into a compact, high-transistor-count specialized device, simplified down to the basics and manufactured cheaply. Unlike the 1970s, when pocket calculators contained a "normal" packaged chip (typically a 24- or 28-pin DIP) mounted through-hole on a regular PC board, today's calculator chips are so common and so cheap that they are made as "die cut" chips, epoxy "blobbed" onto a PC board and wired directly to the copper traces. In many cases the chip is bonded directly to the display substrate, forming one complete integrated chip/display package. This method of manufacturing is very inexpensive, but it means that a regular DIP-packaged chip is no longer needed and therefore no longer available.
In a calculator, as in any modern computing system, the computation of elementary and transcendental mathematical functions such as sine, cosine and logarithms is a required capability. These functions may be evaluated at any point in their domain by any of several methods. The best known are the Taylor series expansion, the Chebyshev series expansion, the CORDIC method and its derivatives, Briggs's method for logarithms, Newton's method and polynomial approximation. These methods differ principally in the primitive operations they require, such as addition, multiplication and factorial evaluation, and in the number of iterations they need to produce a result of a given accuracy. Among them, the most popular in integrated-circuit implementations for calculators and microprocessors is the CORDIC method. Its popularity stems from needing only the relatively simple primitive operations of addition and shifting, and from the wide range of trigonometric and exponential functions that it can evaluate.
CORDIC (COordinate Rotation DIgital Computer) is an iterative algorithm that uses only shift-and-add operations. It can perform the rotation of a two-dimensional (2-D) vector in linear, circular and hyperbolic coordinates. CORDIC has a wide range of applications in signal processing and matrix operations, such as in various popular transforms. Over the past decade or more, extensive research has been devoted to the development of modern signal processing algorithms and methods that have widespread potential, provided they can be implemented in real time using cost-effective hardware. The real-time computational complexity of such algorithms tends to be high, usually significantly greater than that of more conventional techniques (e.g., those based on the fast Fourier transform (FFT) and finite-impulse-response (FIR) filters), in many cases reaching the limits of what is achievable with current technology. As a consequence, considerable research has also been undertaken into parallel architectures and systematic methodologies for mapping matrix algorithms onto such architectures.
The digital signal processing landscape has long been dominated by microprocessors with enhancements such as single-cycle multiply-accumulate instructions and special addressing modes. While these processors are low in cost and offer extreme flexibility, they are often not fast enough for truly demanding DSP tasks. The advent of reconfigurable logic computers permits the higher speeds of dedicated hardware solutions at costs that are competitive with the traditional software approach. Unfortunately, algorithms optimized for microprocessor-based systems do not map well into hardware. While hardware-efficient solutions often exist, the dominance of software systems has kept these solutions out of the spotlight. Among these hardware-efficient algorithms is a class of iterative solutions for trigonometric and other transcendental functions that use only shifts and adds. The trigonometric functions are based on vector rotations, while other functions such as square root are implemented using an incremental expression of the desired function. The trigonometric algorithm is called CORDIC. The incremental functions are performed with a very simple extension to the hardware architecture and, while not CORDIC in the strict sense, are often included because of the close similarity. The CORDIC algorithms generally produce one additional bit of accuracy for each iteration.
The trigonometric CORDIC algorithms were originally developed as a digital solution for real-time navigation problems. The original work is credited to Jack Volder. The CORDIC algorithm has found its way into diverse applications, including the 8087 math coprocessor, the HP-35 calculator, radar signal processors and robotics. CORDIC rotation has also been proposed for computing discrete Fourier transforms, discrete cosine transforms and singular value decompositions, and for solving linear systems.
Transcendental functions are an essential requisite in a wide range of engineering applications, such as image processing, manipulator kinematics, digital signal processing (DSP) and optimization processes, in which a large number of trigonometric and/or exponential operations must be computed efficiently in hardware or on a scientific calculator.
The basic concept of the CORDIC computation is to decompose the desired rotation angle into the weighted sum of a set of predefined elementary rotation angles such that the rotation through each of them can be accomplished with simple shift-and-add operations. All of the trigonometric functions can be computed or derived from functions using vector rotations. Vector rotation can also be used for polar to rectangular and rectangular to polar conversions, vector magnitude and as a building block in certain transforms such as DFT and DCT[1][2]. The CORDIC algorithm can perform the basic arithmetic operations (multiply, divide) as well as the evaluation of the basic trigonometric and hyperbolic functions by iterating a set of coupled equations that involve only additions, subtractions, and bit shifts.

(1)

(2)

(3)

The angle rotation can be executed in several steps, using an iterative process. Each step completes a small part of the rotation. Many steps will compose one planar rotation. A single step is defined by the following equation :

(4)
The above equation can be simplified by factoring out the cosine term, which reduces the algorithm to a few simple shifts and additions. The remaining coefficient can be eliminated from the iteration by pre-computing its final value: because the elementary angles are fixed, each cosine is a constant, and their product can be collected into a single constant K.
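For reference, the usual textbook form of this simplification can be written as below. This is a sketch of the standard derivation, not a reproduction of the paper's numbered equations; the symbols \phi_n (elementary angle of step n) and S_n = ±1 (rotation direction) are assumed notation.

```latex
% A plane rotation through phi_n, with cos(phi_n) factored out:
\begin{aligned}
x_{n+1} &= x_n\cos\phi_n - y_n\sin\phi_n = \cos\phi_n\,\bigl(x_n - y_n\tan\phi_n\bigr),\\
y_{n+1} &= y_n\cos\phi_n + x_n\sin\phi_n = \cos\phi_n\,\bigl(y_n + x_n\tan\phi_n\bigr).
\end{aligned}

% Restricting the elementary angles so that tan(phi_n) = S_n 2^{-n}, S_n = +1 or -1,
% turns each multiplication by tan(phi_n) into a right shift by n bits:
\begin{aligned}
x_{n+1} &= K_n\,\bigl(x_n - S_n\,y_n\,2^{-n}\bigr),\\
y_{n+1} &= K_n\,\bigl(y_n + S_n\,x_n\,2^{-n}\bigr),\\
K_n &= \cos\bigl(\arctan 2^{-n}\bigr) = \frac{1}{\sqrt{1 + 2^{-2n}}},
\qquad K = \prod_{n} K_n .
\end{aligned}
```

Because K_n does not depend on the sign S_n, the product K is the same for every input angle and can be applied once, outside the loop.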

(5)
K is constant for all initial vectors and for all values of the rotation angle; it is normally referred to as the aggregate constant. The derived constant P (approx. 1.64676), the reciprocal of K, is defined here. The coefficient K is pre-computed and taken into account at a later stage for radix 2. The parameter m is introduced to select among three modes of operation: circular, linear and hyperbolic.
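The value 1.64676 quoted above can be checked numerically. The sketch below assumes that K is the product of cos(atan(2^-i)) over the iterations and that P is its reciprocal; that relationship and the function name are assumptions chosen to be consistent with the quoted value, not definitions taken from the paper.

```python
import math

def aggregate_constant(n_iterations):
    """Product of cos(atan(2^-i)) = 1/sqrt(1 + 2^(-2i)) for i = 0 .. n_iterations-1."""
    k = 1.0
    for i in range(n_iterations):
        k /= math.sqrt(1.0 + 2.0 ** (-2 * i))
    return k

K16 = aggregate_constant(16)   # scale applied once, outside the iteration loop
P16 = 1.0 / K16                # growth of the vector length if K is not applied
print(K16, P16)
```

The product converges quickly, so the 16-iteration value already agrees with the quoted constant to the digits shown.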

(6)

(7)
The X and Y values are the final results computed after n iterations, as shown in Figure 1 (rotation of the vector).

(8)
For every step of the rotation, Sn is computed as the sign of Zn.

With the above algorithm and equations, CORDIC can compute all transcendental functions with good accuracy. For the different modes of operation, the selection, inputs and outputs are shown in Table 1.
Or in a program-like style:
for i = 0 to n-1
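The loop body is not reproduced in the text, so the following is a minimal sketch of the standard circular rotation-mode iteration rather than the paper's exact listing. It assumes 16 iterations, angles in radians, and floating-point multiplications by 2^-i standing in for the hardware shifts; the names N, ATAN_TABLE, K and cordic_rotate are illustrative.

```python
import math

N = 16                                                # assumed number of iterations
ATAN_TABLE = [math.atan(2.0 ** -i) for i in range(N)] # pre-computed atan(2^-i)

K = 1.0                                               # aggregate constant for N steps
for i in range(N):
    K /= math.sqrt(1.0 + 2.0 ** (-2 * i))

def cordic_rotate(angle_rad):
    """Rotation mode: drive z to zero, return (cos, sin) of angle_rad."""
    x, y, z = K, 0.0, angle_rad          # starting at x = K pre-compensates the gain
    for i in range(N):
        s = 1.0 if z >= 0 else -1.0      # Sn is the sign of the residual angle Zn
        # In hardware, the products with 2^-i are right shifts by i bits.
        x, y = x - s * y * 2.0 ** -i, y + s * x * 2.0 ** -i
        z -= s * ATAN_TABLE[i]
    return x, y

cos30, sin30 = cordic_rotate(math.radians(30.0))
print(cos30, sin30)
```

Starting the x register at K instead of 1 is one common way to fold the scale-factor compensation into the initial value, so no multiplication is needed after the loop.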

From Table 1 we can compute all kinds of transcendental functions; the tangent can also be derived from the sine and cosine values. The values atan(2^-n) are pre-calculated and stored in a table. The scale factor, however, may not remain constant or predictable; compensating for it may require complicated hardware structures (as shown in Figure 2) or comparable post-processing cycles.
CORDIC can operate in two different modes, (i) rotation and (ii) vectoring, in which the z or the y variable respectively is forced to zero through a series of iterations.
In rotation mode, the angle accumulator is initialized with the desired rotation angle. The rotation decision at each iteration is made to diminish the magnitude of residual angle in the angle accumulator. The decision at each iteration is based on the sign of the residual angle after each step.

This algorithm is commonly referred to as driving Z to zero. At that point, the X and Y values are the cosine and sine. The CORDIC core computes the sine and cosine values:

(9)
Driving Z to zero has a special case: polar-to-rectangular conversion.
In the vectoring mode, the CORDIC rotator rotates the input vector through whatever angle is necessary to align the result vector with the x-axis. The result of the vectoring operation is a rotation angle and the scaled magnitude of the original vector. The vectoring function works by reducing the residual y component at each step, and the sign of that residual is used to determine the rotation direction. If the angle accumulator is initialized with zero, it will contain the traversed angle at the end of the iterations. This algorithm is commonly referred to as driving Y to zero (rectangular-to-polar conversion). The CORDIC core then computes magnitude and phase:

(10)
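As an illustration of the vectoring mode, the sketch below drives y to zero and returns the magnitude and phase. It assumes the standard circular iteration with 16 steps and an input vector inside the convergence range (in particular x > 0); the names N, ATAN_TABLE, GAIN and cordic_vector are illustrative, and the final division by the gain stands in for whatever scale-factor compensation the hardware uses.

```python
import math

N = 16                                                 # assumed number of iterations
ATAN_TABLE = [math.atan(2.0 ** -i) for i in range(N)]

GAIN = 1.0                                             # growth of |v| over N steps
for i in range(N):
    GAIN *= math.sqrt(1.0 + 2.0 ** (-2 * i))

def cordic_vector(x, y):
    """Vectoring mode: drive y to zero, return (magnitude, phase in radians)."""
    z = 0.0                               # accumulator starts at zero
    for i in range(N):
        s = -1.0 if y >= 0 else 1.0       # rotate toward the x-axis
        x, y = x - s * y * 2.0 ** -i, y + s * x * 2.0 ** -i
        z -= s * ATAN_TABLE[i]            # accumulate the traversed angle
    return x / GAIN, z                    # undo the scaled magnitude

magnitude, phase = cordic_vector(3.0, 4.0)
print(magnitude, math.degrees(phase))
```

The only difference from the rotation mode is the decision variable: the direction is taken from the sign of y instead of the sign of z.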
Each of these modes can be used in the circular, linear and hyperbolic coordinate systems to compute various functions. The classical CORDIC approach suffers from five principal drawbacks: 1) the requirement of scale-factor compensation; 2) the magnitude restriction on the input variables; 3) low speed of execution; 4) the quantization effect of CORDIC (angle quantization); and 5) the standard representation of input angles and output values in terms of IEEE standards. The first drawback requires additional multiplication operations. The second has a significant impact on the accuracy of the computed function, since accuracy depends on how closely the variable y (in vectoring mode) or z (in rotation mode) can be driven to zero; they can be driven close to zero only if the initial inputs lie within a certain range called the "range of convergence" (+/- 99.88 degrees). The third drawback comes from the iterative nature of the CORDIC algorithm. The fourth arises from the angle approximation inherent in the iteration, and the final drawback from the input representation.
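The figure of 99.88 for the range of convergence can be checked as the sum of all elementary angles atan(2^-i), expressed in degrees. This is a small sketch under that assumption; the function name is illustrative.

```python
import math

def convergence_range_deg(n_iterations):
    """Largest angle reachable when every step rotates in the same direction."""
    total_rad = sum(math.atan(2.0 ** -i) for i in range(n_iterations))
    return math.degrees(total_rad)

print(convergence_range_deg(16))   # approximately 99.88 degrees
```

Inputs outside this range need a preliminary quadrant correction before the iterations start.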
The principal aim of this work is to develop a CORDIC processor that is 1) power efficient, 2) free from the scale-factor compensation problem, 3) convergent over the entire coordinate space, and 4) free from a look-up table (ROM). More specifically, we concentrate on the rotation and vectoring modes of operation (i.e., driving z to 0 or y to 0 respectively) in the Cartesian coordinate system.
The performance and time efficiency of the CORDIC method can be investigated and evaluated by two methods, based on the conventional CORDIC and on Matlab simulation. The first is the absolute-error constraint method, in which the absolute error is taken as the difference between a CORDIC computational result and the Matlab simulation result. Angles from 0 to 90 degrees were analysed in steps of 5 degrees (as shown in Figure 3); the minimum error at iteration 16 is 0.0000531 (as shown in Table 2).
From the plot and the table we can conclude that 16 iterations are enough to calculate the cosine and sine values for all the angles. In the conventional algorithm, the accuracy of the result improves as the number of iterations increases, but angle sensors provide at most 2 decimal places of precision, so 2 decimal places is the target here. For that precision, 16 iterations are more than enough to compute the trigonometric values accurately.
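The kind of sweep described above can be sketched as follows, using Python's math library as the reference in place of Matlab. This is an illustration of the error-versus-iterations analysis under assumed conditions (double-precision arithmetic, standard rotation-mode iteration); it is not expected to reproduce the exact values in Figure 3 or Table 2, and the function names are illustrative.

```python
import math

def cordic_sin_cos(angle_rad, n):
    """Standard rotation-mode CORDIC with n iterations; returns (sin, cos)."""
    k = 1.0
    for i in range(n):
        k /= math.sqrt(1.0 + 2.0 ** (-2 * i))
    x, y, z = k, 0.0, angle_rad
    for i in range(n):
        s = 1.0 if z >= 0 else -1.0
        x, y = x - s * y * 2.0 ** -i, y + s * x * 2.0 ** -i
        z -= s * math.atan(2.0 ** -i)
    return y, x

def max_abs_error(n):
    """Worst absolute sine/cosine error over 0..90 degrees in 5-degree steps."""
    worst = 0.0
    for deg in range(0, 91, 5):
        a = math.radians(deg)
        s, c = cordic_sin_cos(a, n)
        worst = max(worst, abs(s - math.sin(a)), abs(c - math.cos(a)))
    return worst

for n in (4, 8, 12, 16):
    print(n, max_abs_error(n))
```

Each extra iteration roughly halves the bound on the residual angle, which is why the worst-case error drops steadily with the iteration count.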

In general, the initial iterations take atan(2^-i) from the look-up table. After the first k iterations, from iteration k+1 onward, the angle is no longer read from the table; instead 2^-i itself, a shifted version of the iteration number, is used as the angle. This minimizes the number of look-up locations and therefore the memory size. A smaller look-up table is efficient for a calculator that performs the basic computation of all elementary functions. The angle error can be computed for each value of k, and k is selected according to the error rate; the data in the table are used for this comparison. Beyond about 5 iterations, the shifted value 2^-i is almost equal to the angle in radians from the look-up table, so the angle can be taken from the shifted iteration number instead of from the table.
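The table-plus-shift scheme described above can be sketched as a small variation of the rotation-mode loop. This is an illustration under stated assumptions, not the paper's architecture: 16 iterations in total, a split point of k = 5 (steps 0 to 4 read the table), radians throughout, and illustrative names N, K_LUT, LUT, step_angle and hybrid_sin_cos.

```python
import math

N = 16        # total number of iterations (assumed)
K_LUT = 5     # assumed split point: steps 0 .. K_LUT-1 use the table

LUT = [math.atan(2.0 ** -i) for i in range(K_LUT)]   # only K_LUT stored angles

SCALE = 1.0   # aggregate constant for N steps
for i in range(N):
    SCALE /= math.sqrt(1.0 + 2.0 ** (-2 * i))

def step_angle(i):
    """Angle subtracted from the accumulator at step i, in radians."""
    # Coarse steps read the table; fine steps use 2^-i, which needs no memory.
    return LUT[i] if i < K_LUT else 2.0 ** -i

def hybrid_sin_cos(angle_rad):
    """Rotation-mode CORDIC with a truncated angle table; returns (sin, cos)."""
    x, y, z = SCALE, 0.0, angle_rad
    for i in range(N):
        s = 1.0 if z >= 0 else -1.0
        x, y = x - s * y * 2.0 ** -i, y + s * x * 2.0 ** -i
        z -= s * step_angle(i)
    return y, x

print(hybrid_sin_cos(math.radians(30.0)))
```

The vector is still rotated by atan(2^-i) at every step; only the bookkeeping in the accumulator uses the approximation, so the extra error is the accumulated difference between 2^-i and atan(2^-i) over the fine steps.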
For example, at iteration 9 the angle from the look-up table is 0.1119056770662746 degrees, which is 0.001953123 radians, while the shifted value 2^-9 is 0.001953125. From this example we can conclude that the shifted value and the look-up angle are effectively the same after a certain number of iterations. The iterations can be analysed by calculating the error rate between the normal computation and the fine-grain computation. The angle value in the normal computation and the fine-grain value should agree so that the angle error is minimized; if the angle error is large, the computed result will not match the normal computation. The angle error should therefore be very small to obtain a high-accuracy result. After 5 iterations the error rate was reduced to 1.4635*10^-5 (as shown in Table 3). Since the accuracy of the angle computation after 16 iterations was 4 decimal places, after 5 iterations the value can be taken from the shift instead of the LUT. The error calculation table is given below.
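The iteration-9 figures can be verified directly, and the same comparison can be tabulated for every iteration. This is a small sketch; the helper name angle_error is illustrative, and the per-iteration difference it prints is not necessarily the same error metric as the one reported in Table 3.

```python
import math

def angle_error(i):
    """Difference between the shifted value 2^-i and the table angle atan(2^-i), in radians."""
    return 2.0 ** -i - math.atan(2.0 ** -i)

lut_9 = math.atan(2.0 ** -9)
print(math.degrees(lut_9))   # table angle at iteration 9, in degrees
print(lut_9, 2.0 ** -9)      # table angle in radians versus the shifted value

for i in range(16):
    print(i, math.atan(2.0 ** -i), 2.0 ** -i, angle_error(i))
```

The difference shrinks by roughly a factor of eight per iteration, since atan(t) is approximately t - t^3/3 for small t.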
A memory access needs two cycles: one to search for the data (tag check) and one to read the data from the ROM. If the angle is taken from the shifted iteration number, one clock is enough to shift the result, since every stage has its own shifter. No extra computation is needed, and memory is saved in the handheld scientific calculator. Comparing memory sizes, the conventional algorithm requires a total of 576 bits for 16 iterations, whereas the proposed architecture needs only 180 bits, a reduction by more than a factor of three; the saving matters even more when higher precision is required. The memory calculation is based on a 2-bit tag and a 32-bit single-precision floating-point number for the angle representation.
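The quoted sizes can be sanity-checked with a little arithmetic. Both totals are consistent with a fixed entry width of 36 bits, with 16 table entries in the conventional design and 5 in the proposed one; the entry width and the entry counts are inferred here from the totals and are not stated in the text.

```latex
\frac{576\ \text{bits}}{16\ \text{entries}} = 36\ \text{bits per entry}, \qquad
\frac{180\ \text{bits}}{36\ \text{bits per entry}} = 5\ \text{entries}, \qquad
\frac{576}{180} = 3.2 .
```

A ratio of 3.2 matches the claim of a reduction by more than a factor of three.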

The performance analysis of the radix-2 CORDIC implementation was done with the RC Compiler (Cadence 6.1) ASIC design tool. The results are compared across different libraries. For 180nm technology the slow_normal.lib library is used. For 90nm technology tcbn90gbwp7tlvttc0d88_ccs.lib is used. For 45nm technology a third library is used (its name is not given). As the technology shrinks, the worst-path delay, area and net power all go down. Table 4 shows the synthesis results for a single-stage radix-2 CORDIC architecture in circular rotation mode. Table 5 shows the synthesis results for a 16-stage radix-2 CORDIC architecture in circular rotation mode. When the design was synthesized, the maximum clock frequency was found to be 37.7 MHz for 45nm technology. This is reasonable given the complexity required to implement floating-point adders.
The design and analysis of a high-precision, high-speed handheld scientific calculator have been presented in this paper. The computational precision of the proposed processor is similar to that of the conventional CORDIC processor, while its hardware complexity is lower. The work also shows that the hardware cost of the proposed processor can be reduced significantly. The proposed processor consumes less power than the classical CORDIC because the number of arithmetic operations actually required is significantly reduced, which improves both the calculator's computation and its precision.
