ISSN 1214-9675. This server was established with the support of the Czech Science Foundation (Grantová agentura ČR). Volume 21.
Published on 13. 05. 2010 (11147 reads)
(1) [equation image not preserved in the source]
The last algorithm used for the compiler efficiency comparison was an implementation of an edge detector [4] (enhanced 2D linear filtering, i.e. 2D convolution) that detects rising and falling edges in the horizontal and vertical directions according to the following basic convolution formula:
(2) [equation image not preserved in the source]
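The benchmarked source code is not reproduced in the article, so the following is only a minimal sketch of a 3x3 2D convolution of the kind described in [4]. The function name `convolve3x3`, the Sobel-type kernel `KERNEL_H`, the 8-bit input format and the zeroed border are all assumptions made for illustration, not details taken from the benchmark.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 3x3 kernel sensitive to intensity changes along x
 * (Sobel-type). The article does not list the kernel it used. */
static const int16_t KERNEL_H[3][3] = {
    { -1, 0, 1 },
    { -2, 0, 2 },
    { -1, 0, 1 }
};

/* Textbook 2D convolution with a 3x3 kernel w:
 *   out(x, y) = sum over s, t in {-1, 0, 1} of w(s, t) * in(x - s, y - t)
 * Border pixels are set to 0 (an assumption; other border policies exist).
 * The sign of the response distinguishes rising from falling edges. */
void convolve3x3(const uint8_t *in, int16_t *out,
                 int width, int height, const int16_t w[3][3])
{
    assert(in != NULL && out != NULL && w != NULL);
    assert(width >= 3 && height >= 3);

    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            if (x == 0 || y == 0 || x == width - 1 || y == height - 1) {
                out[y * width + x] = 0;   /* no full neighbourhood here */
                continue;
            }
            int32_t acc = 0;
            for (int t = -1; t <= 1; ++t) {
                for (int s = -1; s <= 1; ++s) {
                    acc += (int32_t)w[t + 1][s + 1]
                         * in[(y - t) * width + (x - s)];
                }
            }
            out[y * width + x] = (int16_t)acc;  /* |acc| <= 8 * 255 here */
        }
    }
}
```

Passing a transposed kernel would give the vertically oriented detector. The inner multiply-accumulate loops are the part whose generated code the two compilers would be expected to differ on most.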
The same source code was compiled with the makefile described in section 2 (GNU tool-chain) and with the native VisualDSP++ development tool. Both builds ran on the same hardware under the same conditions: the processor was operated at a core frequency of 400 MHz with the external data memory running at 100 MHz. The resulting execution times of the described algorithms are listed below.
Fig. 3 Benchmark results in non OS environment
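The article does not state how the execution times were measured. Assuming they were derived from core cycle counts, the conversion to time at the stated 400 MHz core clock can be sketched as follows; the function name `cycles_to_microseconds` and the constant `CORE_CLOCK_HZ` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Core clock stated in the article: 400 MHz. */
#define CORE_CLOCK_HZ 400000000u

/* Convert a core cycle count to microseconds.
 * How the cycle count is obtained is outside this sketch. */
double cycles_to_microseconds(uint64_t cycles, uint32_t core_clock_hz)
{
    assert(core_clock_hz > 0);
    return (double)cycles * 1e6 / (double)core_clock_hz;
}
```

At 400 MHz one core cycle lasts 2.5 ns, so a single clock period of the 100 MHz external memory spans four core cycles.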
The GNU tool-chain is mainly used to compile code that runs under the Linux operating system, so it is worth testing its performance under these conditions as well.
Target hardware and operating system
A Blackfin® video processing evaluation board (Figure 4) served as the target hardware for the benchmarks run under an operating system. The board is based on the ADSP-BF532 DSP and is equipped with 32Mb SDRAM, a USB 2.0 controller, a 10/100 Mbit Ethernet controller and an SD/MMC expansion slot. The core clock is 400 MHz and the system clock is 133 MHz. The operating system was uCLinux version 2.6.19-ADI-2007R1.1. Code compiled using VDSP++ was tested on the same board in the same configuration. The benchmark results are presented below.
Fig. 4 uCLinux video processing evaluation board block diagram
Benchmark results
Fig. 5 Image processing code running under OS uCLinux vs. VDSP++ code execution times
According to the benchmark results (Figure 5), the code compiled with the GNU tool-chain and running under uCLinux appears to be faster than the code compiled with VisualDSP++. A likely explanation is that uCLinux uses the internal cache memory by default, so external memory accesses are more efficient and therefore faster. The code compiled under VisualDSP++ was built without cache support, and the delays caused by its external memory accesses are much longer than the delays caused by operating system services running in the multi-threaded uCLinux.
The main goal of this work was to find an alternative, free solution for compiling code for Blackfin® processors. One possible alternative is the GNU tool-chain, which is released under the GPL license and is free of charge. The main question, whether code compiled with the GNU tools needs an operating system in order to run, was answered, and a way to use the tool-chain without the uCLinux operating system was presented.
The second important parameter in signal processing applications is code execution time. A comparison of the code efficiency of the commercial native compiler VisualDSP++ (Analog Devices, Inc.) and the proposed GNU tool-chain was presented in section 3 (Figure 3). The general conclusion of these performance tests is that, for the chosen algorithms, the GNU tool-chain produces less efficient code than VisualDSP++ when running in a non-OS environment. One important factor affecting this result is the absence of hardware loop support in the GNU tool-chain.
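To illustrate why hardware loop support matters, the sketch below shows the kind of counted multiply-accumulate loop that dominates signal processing kernels. It is not taken from the benchmarks; the function name `dot_product_i16` and the data types are assumptions. Blackfin processors provide zero-overhead hardware loops, so a compiler that uses them can run the loop body without a per-iteration counter update and branch, while a compiler that does not must emit those instructions for every pass.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Counted loop with a trip count known on entry: a typical candidate
 * for a zero-overhead hardware loop. Whether a given compiler maps it
 * to one depends on the tool and its options. */
int32_t dot_product_i16(const int16_t *a, const int16_t *b, size_t n)
{
    assert(a != NULL && b != NULL);

    int32_t acc = 0;
    for (size_t i = 0; i < n; ++i) {
        acc += (int32_t)a[i] * b[i];   /* one multiply-accumulate per pass */
    }
    return acc;
}
```

With a loop body this short, the software loop overhead is a large fraction of the total work, which is consistent with the gap the article reports for the non-OS builds.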
In general, the GPL alternative tools (GNU tool-chain) can be used to develop stand-alone Blackfin® applications that do not run under any operating system. If the application is time critical, it is better to use the native development tools. The GNU tools are optimized for execution under an operating system, so it is better to first set up an appropriate version of uCLinux for the application being developed. An operating system extends the capabilities of an embedded application (the price in higher hardware complexity is reasonable), and the GNU tools work more effectively with it. For example, the benchmark algorithms presented in section 3 were compiled for execution under uCLinux and run on simple Blackfin® embedded hardware (section 4). The operating conditions were almost the same (processor model, core frequency). The execution times of several algorithms were fully comparable with the results obtained for code compiled using VDSP++ (Figure 5). This can be explained by the use of the processor's internal cache: the operating system and all applications running under it benefit from the cache automatically, whereas the code built without the operating system did not use the cache in the presented performance tests.
Based on all the acquired results, the benchmarked GNU tools can in general be used to develop signal processing applications not only in OS environments but also in non-OS environments, at the cost of lower compiled code performance.
The research described was supported by research program No. MSM6840770015 “Research Methods and Systems for Measurement of Physical Quantities and Measured Data Processing” of the CTU in Prague, sponsored by the Ministry of Education, Youth and Sports of the Czech Republic, and by Czech Agency Grant GA 102/09/H082. Development tools donated by Analog Devices Inc. were used in this work.
[1] main [Blackfin Linux Docs], [Internet], 28. 3. 2009, http://docs.blackfin.uclinux.org/
[2] Das U-Boot for the Blackfin Processor, [Internet], 28. 3. 2009, http://blackfin.uclinux.org/gf/project/u-boot
[3] uClibc, [Internet], 28. 3. 2009, http://www.uclibc.org/
[4] R. C. Gonzalez, R. E. Woods, “Digital Image Processing”, 3rd ed., Prentice Hall, 2007
[5] M. Egmont-Petersen, J. H. C. Reiber, “Accurate object localization in gray level images using the center of gravity measure: Accuracy versus precision”, IEEE Transactions on Image Processing, 2002, Vol. 11, Issue 12, pp. 1379-1384, doi: 10.1109/TIP.2002.806250