Note: the complete report is located on the Intel BBS with the filename 251BENCH.EXE (self-extracting, Word 6.0 format). It contains additional results as well as code examples of the tests performed. You can call the BBS at 916-356-3600.
All programs were assembled in two modes: binary mode and source mode. Programs with 51 instructions assembled in binary mode are 8XC51FX compatible and can be run by all units. Programs with 51 instructions assembled in source mode and programs with 251 instructions in both binary and source mode can only be run by 87C251SB microcontrollers.
Experiments 1, 2 and 3 were run on the EV80C51FX evaluation board, while Experiment 4 was run on the 8XC51FX target board.
Experiment 1 compares the processing speed of the 8XC51FX to the 8XC251SB by emulating data transfer from internal code memory to external data memory. The programs used are shown below:
Task | Description | Instruction Type |
a. | Loop 3825 times. In each loop move 64 bytes of constant data from internal code memory to external data memory. | CPU |
b. | Flashing the LED through port 1. | I/O |
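The benchmark programs themselves are assembly files distributed with the full report and are not reproduced here. As a rough illustration of task (a) only, the following C sketch models the same workload: 3825 loops, each copying 64 bytes from a constant table to a data buffer. The names code_table, xdata_buffer, copy_block and run_transfer are hypothetical, and ordinary C arrays stand in for internal code memory and external data memory.

```c
#include <stddef.h>
#include <stdint.h>

#define BLOCK_SIZE 64u    /* bytes moved per loop, from the task table */
#define LOOP_COUNT 3825u  /* number of loops, from the task table */

/* Stand-in for the constant data in internal code memory.
   The contents are arbitrary; the remaining bytes are zero. */
static const uint8_t code_table[BLOCK_SIZE] = {1, 2, 3, 4};

/* Stand-in for the destination in external data memory. */
static uint8_t xdata_buffer[BLOCK_SIZE];

/* Copy one block, one byte at a time. */
void copy_block(uint8_t *dst, const uint8_t *src, size_t n) {
    for (size_t i = 0; i < n; i++) {
        dst[i] = src[i];
    }
}

/* Repeat the block copy LOOP_COUNT times and return the
   total number of bytes moved. */
uint32_t run_transfer(void) {
    uint32_t moved = 0;
    for (uint32_t loop = 0; loop < LOOP_COUNT; loop++) {
        copy_block(xdata_buffer, code_table, BLOCK_SIZE);
        moved += BLOCK_SIZE;
    }
    return moved;
}
```

In this model run_transfer() moves 3825 x 64 = 244,800 bytes in total. The sketch says nothing about timing; it only shows the shape of the workload being measured.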
The results of the benchmarking are shown as follows:
TMAC11B is written in 100% 51 instructions and assembled in Binary Mode
TMAC11S is written in 100% 51 instructions and assembled in Source Mode
Unit | Device | Mode | Mem | W/s | Page | Prog | Time 1 (Min:Sec) | Time 2 (Min:Sec) | Time 3 (Min:Sec) | Ave (Min:Sec) | Ratio to FX |
1 | 87C51FB | - | Ext | - | - | TMAC11B | 3:40 | 3:40 | 3:40 | 3:40 | 1x |
2 | 87C251SB | Bin | Ext | 0 | non-page | TMAC11B | 1:18 | 1:18 | 1:18 | 1:18 | 2.82x |
3 | 87C251SB | Bin | Ext | 1 | non-page | TMAC11B | 1:51 | 1:51 | 1:51 | 1:51 | 1.98x |
4 | 87C251SB | Src | Ext | 0 | non-page | TMAC11S | 1:42 | 1:42 | 1:42 | 1:42 | 2.16x |
5 | 87C251SB | Src | Ext | 1 | non-page | TMAC11S | 2:27 | 2:27 | 2:27 | 2:27 | 1.50x |
6 | 87C251SB | Bin | Int | 0 | non-page | TMAC11B | 0:44 | 0:44 | 0:44 | 0:44 | 5x |
7 | 87C251SB | Src | Int | 0 | non-page | TMAC11S | 0:57 | 0:57 | 0:57 | 0:57 | 3.86x |
TABLE 1.1
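The report does not state how the Ratio to FX column was calculated. The listed values are consistent with dividing the 87C51FB average time by the average time of the unit in question, and the following C sketch works on that assumption. The helper names to_seconds and ratio_x100 are hypothetical.

```c
/* Convert a Min:Sec reading from the table to whole seconds. */
unsigned to_seconds(unsigned minutes, unsigned seconds) {
    return minutes * 60u + seconds;
}

/* Speed-up relative to the 87C51FB in hundredths (282 means 2.82x),
   rounded to the nearest hundredth.
   Assumes dev_seconds is non-zero. */
unsigned ratio_x100(unsigned fx_seconds, unsigned dev_seconds) {
    return (fx_seconds * 100u + dev_seconds / 2u) / dev_seconds;
}
```

For example, ratio_x100(to_seconds(3, 40), to_seconds(1, 18)) evaluates to 282, which matches the 2.82x listed for unit 2 in Table 1.1.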
TMAC11B is written in 100% 51 instructions and assembled in Binary Mode
TMAC12B is written in optimised 251 instructions and assembled in Binary Mode
TMAC12S is written in optimised 251 instructions and assembled in Source Mode
Unit | Device | Mode | Mem | W/s | Page | Prog | Time 1 (Min:Sec) | Time 2 (Min:Sec) | Time 3 (Min:Sec) | Ave (Min:Sec) | Ratio to FX |
1 | 87C51FB | - | Ext | - | - | TMAC11B | 3:40 | 3:40 | 3:40 | 3:40 | 1x |
2 | 87C251SB | Bin | Ext | 0 | non-page | TMAC12B | 0:31 | 0:31 | 0:31 | 0:31 | 7.10x |
3 | 87C251SB | Bin | Ext | 1 | non-page | TMAC12B | 0:43 | 0:43 | 0:43 | 0:43 | 5.12x |
4 | 87C251SB | Src | Ext | 0 | non-page | TMAC12S | 0:27 | 0:27 | 0:27 | 0:27 | 8.15x |
5 | 87C251SB | Src | Ext | 1 | non-page | TMAC12S | 0:37 | 0:37 | 0:37 | 0:37 | 5.96x |
6 | 87C251SB | Bin | Int | 0 | non-page | TMAC12B | 0:14 | 0:14 | 0:14 | 0:14 | 15.71x |
7 | 87C251SB | Src | Int | 0 | non-page | TMAC12S | 0:12 | 0:12 | 0:12 | 0:12 | 18.33x |
TABLE 1.2
Experiment 2 compares the processing speed of the 8XC51FX to the 8XC251SB by performing Multiplication and Accumulation (MAC) routines on 16-bit signed integers with 32-bit results. The programs used are shown below:
Task | Description | Instruction Type |
a. | Loop 65,025 times a 16 bit Multiplication and Accumulation (MAC) | CPU |
b. | Flashing the LED through port 1. | I/O |
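As a rough illustration of task (a), the following C sketch shows a 16-bit signed multiply accumulated into a 32-bit result and repeated 65,025 times. It is a model of the arithmetic only, not the assembly used in the benchmark. The names mac_step and run_mac are hypothetical, and the operands are assumed to be small enough that the 32-bit accumulator does not overflow.

```c
#include <stdint.h>

#define MAC_LOOPS 65025u  /* loop count from the task table (255 * 255) */

/* One MAC step: widen both 16-bit operands to 32 bits before
   multiplying, then add the product to the accumulator. */
int32_t mac_step(int32_t acc, int16_t a, int16_t b) {
    return acc + (int32_t)a * (int32_t)b;
}

/* Repeat the MAC step MAC_LOOPS times with fixed operands.
   The caller must choose a and b so the sum fits in 32 bits. */
int32_t run_mac(int16_t a, int16_t b) {
    int32_t acc = 0;
    for (uint32_t i = 0; i < MAC_LOOPS; i++) {
        acc = mac_step(acc, a, b);
    }
    return acc;
}
```

The widening casts are the point of the example: 300 x 300 = 90,000 does not fit in 16 bits, so the product has to be formed at 32-bit width. On the 8XC51FX this has to be built up from 8-bit multiplies, which is one plausible reason the optimised 251 versions gain so much in this experiment.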
The results of the benchmarking are shown as follows:
TMAC21B is written in 100% 51 instructions and assembled in Binary Mode
TMAC21S is written in 100% 51 instructions and assembled in Source Mode
Unit | Device | Mode | Mem | W/s | Page | Prog | Time 1 (Min:Sec) | Time 2 (Min:Sec) | Time 3 (Min:Sec) | Ave (Min:Sec) | Ratio to FX |
1 | 87C51FB | - | Ext | - | - | TMAC21B | 4:17 | 4:17 | 4:17 | 4:17 | 1x |
2 | 87C251SB | Bin | Ext | 0 | non-page | TMAC21B | 1:42 | 1:41 | 1:42 | 1:42 | 2.52x |
3 | 87C251SB | Bin | Ext | 1 | non-page | TMAC21B | 2:29 | 2:29 | 2:29 | 2:29 | 1.72x |
4 | 87C251SB | Src | Ext | 0 | non-page | TMAC21S | 2:12 | 2:12 | 2:12 | 2:12 | 1.95x |
5 | 87C251SB | Src | Ext | 1 | non-page | TMAC21S | 3:15 | 3:15 | 3:15 | 3:15 | 1.32x |
6 | 87C251SB | Bin | Int | 0 | non-page | TMAC21B | 0:57 | 0:57 | 0:57 | 0:57 | 4.51x |
7 | 87C251SB | Src | Int | 0 | non-page | TMAC21S | 1:12 | 1:12 | 1:12 | 1:12 | 3.57x |
TABLE 2.1
TMAC21B is written in 100% 51 instructions and assembled in Binary Mode
TMAC22B is written in optimised 251 instructions and assembled in Binary Mode
TMAC22S is written in optimised 251 instructions and assembled in Source Mode
Unit | Device | Mode | Mem | W/s | Page | Prog | Time 1 (Min:Sec) | Time 2 (Min:Sec) | Time 3 (Min:Sec) | Ave (Min:Sec) | Ratio to FX |
1 | 87C51FB | - | Ext | - | - | TMAC21B | 4:17 | 4:17 | 4:17 | 4:17 | 1x |
2 | 87C251SB | Bin | Ext | 0 | non-page | TMAC22B | 1:09 | 1:09 | 1:09 | 1:09 | 3.72x |
3 | 87C251SB | Bin | Ext | 1 | non-page | TMAC22B | 1:39 | 1:39 | 1:39 | 1:39 | 2.60x |
4 | 87C251SB | Src | Ext | 0 | non-page | TMAC22S | 1:00 | 1:00 | 1:00 | 1:00 | 4.28x |
5 | 87C251SB | Src | Ext | 1 | non-page | TMAC22S | 1:25 | 1:25 | 1:25 | 1:25 | 3.02x |
6 | 87C251SB | Bin | Int | 0 | non-page | TMAC22B | 0:36 | 0:36 | 0:36 | 0:36 | 7.14x |
7 | 87C251SB | Src | Int | 0 | non-page | TMAC22S | 0:32 | 0:32 | 0:32 | 0:32 | 8.03x |
TABLE 2.2
Experiment 3 compares the processing speed of the 8XC51FX and 8XC251SB microcontrollers by performing 3x3 Matrix Multiplication on 16-bit signed integers with 32-bit results. The programs used are shown below:
Task | Description | Instruction Type |
a. | Loop 3825 times the 16 bit 3 x 3 Matrix Multiplication | CPU |
b. | Flashing the LED through port 1. | I/O |
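As a rough illustration of task (a), the following C sketch multiplies two 3x3 matrices of 16-bit signed integers into a matrix of 32-bit results. It models the arithmetic only and is not the benchmark source. The name matmul3 is hypothetical, and the entries are assumed to be small enough that each 32-bit sum does not overflow.

```c
#include <stdint.h>

#define DIM 3  /* matrix dimension used in the experiment */

/* c = a * b for 3x3 matrices. Inputs are 16-bit signed values;
   each product is formed at 32-bit width and the three products
   for an output entry are summed into a 32-bit result. */
void matmul3(int16_t a[DIM][DIM], int16_t b[DIM][DIM], int32_t c[DIM][DIM]) {
    for (int i = 0; i < DIM; i++) {
        for (int j = 0; j < DIM; j++) {
            int32_t sum = 0;
            for (int k = 0; k < DIM; k++) {
                sum += (int32_t)a[i][k] * (int32_t)b[k][j];
            }
            c[i][j] = sum;
        }
    }
}
```

Each call performs 27 multiply-accumulate steps (9 output entries, 3 products each), so a single loop of this experiment does considerably more arithmetic than a single loop of the MAC experiment.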
The results of the benchmarking are shown as follows:
TMAC31B is written in 100% 51 instructions and assembled in Binary Mode
TMAC31S is written in 100% 51 instructions and assembled in Source Mode
Unit | Device | Mode | Mem | W/s | Page | Prog | Time 1 (Min:Sec) | Time 2 (Min:Sec) | Time 3 (Min:Sec) | Ave (Min:Sec) | Ratio to FX |
1 | 87C51FB | - | Ext | - | - | TMAC31B | 6:58 | 6:58 | 6:59 | 6:58 | 1x |
2 | 87C251SB | Bin | Ext | 0 | non-page | TMAC31B | 3:41 | 3:41 | 3:41 | 3:41 | 1.89x |
3 | 87C251SB | Bin | Ext | 1 | non-page | TMAC31B | 5:25 | 5:25 | 5:26 | 5:25 | 1.29x |
4 | 87C251SB | Src | Ext | 0 | non-page | TMAC31S | 3:41 | 3:41 | 3:41 | 3:41 | 1.89x |
5 | 87C251SB | Src | Ext | 1 | non-page | TMAC31S | 5:25 | 5:25 | 5:25 | 5:25 | 1.29x |
6 | 87C251SB | Bin | Int | 0 | non-page | TMAC31B | 2:00 | 2:00 | 2:00 | 2:00 | 3.48x |
7 | 87C251SB | Src | Int | 0 | non-page | TMAC31S | 2:01 | 2:01 | 2:01 | 2:01 | 3.45x |
TABLE 3.1
TMAC31B is written in 100% 51 instructions and assembled in Binary Mode
TMAC32B is written in optimised 251 instructions and assembled in Binary Mode
TMAC32S is written in optimised 251 instructions and assembled in Source Mode
Unit | Device | Mode | Mem | W/s | Page | Prog | Time 1 (Min:Sec) | Time 2 (Min:Sec) | Time 3 (Min:Sec) | Ave (Min:Sec) | Ratio to FX |
1 | 87C51FB | - | Ext | - | - | TMAC31B | 6:58 | 6:58 | 6:59 | 6:58 | 1x |
2 | 87C251SB | Bin | Ext | 0 | non-page | TMAC32B | 0:59 | 0:59 | 0:59 | 0:59 | 7.08x |
3 | 87C251SB | Bin | Ext | 1 | non-page | TMAC32B | 1:23 | 1:23 | 1:23 | 1:23 | 5.04x |
4 | 87C251SB | Src | Ext | 0 | non-page | TMAC32S | 0:53 | 0:53 | 0:53 | 0:53 | 7.89x |
5 | 87C251SB | Src | Ext | 1 | non-page | TMAC32S | 1:13 | 1:13 | 1:13 | 1:13 | 5.73x |
6 | 87C251SB | Bin | Int | 0 | non-page | TMAC32B | 0:35 | 0:35 | 0:35 | 0:35 | 11.94x |
7 | 87C251SB | Src | Int | 0 | non-page | TMAC32S | 0:32 | 0:32 | 0:32 | 0:32 | 13.06x |
TABLE 3.2
Experiment 4 compares the processing speed of the 8XC51FX and 8XC251SB microcontrollers on a combined program consisting of the emulated data transfer from internal code memory to external data memory, Multiplication and Accumulation (MAC), and 3x3 Matrix Multiplication on 16-bit signed integers. This experiment used different hardware: the 8XC51FX target board, which has page mode capability, was used to test the performance of this new feature of the 87C251SB. The programs used are shown below:
Task | Description | Instruction Type |
a. | Loop 3825 times moving 64 bytes of constant data from internal code memory to external data memory | CPU |
b. | Loop 65,025 times a 16 bit Multiplication and Accumulation (MAC) | CPU |
c. | Loop 3825 times a 16 bit 3 x 3 matrix Multiplication | CPU |
d. | Flashing the LED through port 1 | I/O |
The results of the benchmarking are shown as follows:
TMAC1 is written in 100% 51 instructions and assembled in Binary Mode
TMAC2 is written in 100% 51 instructions and assembled in Source Mode
Unit | Device | Mode | Mem | W/s | Page | Prog | Time 1 (Min:Sec) | Time 2 (Min:Sec) | Time 3 (Min:Sec) | Ave (Min:Sec) | Ratio to FX |
1 | 87C51FB | - | Int | - | - | TMAC1 | 14:53 | 14:53 | 14:53 | 14:53 | 1x |
2 | 87C51FB | - | Ext | - | - | TMAC1 | 14:53 | 14:53 | 14:53 | 14:53 | 1x |
5 | 87C251SB | Bin | Ext | 0 | non-page | TMAC1 | 6:40 | 6:40 | 6:40 | 6:40 | 2.23x |
6 | 87C251SB | Bin | Ext | 1 | non-page | TMAC1 | 9:45 | 9:45 | 9:45 | 9:45 | 1.53x |
7 | 87C251SB | Src | Ext | 0 | non-page | TMAC2 | 7:35 | 7:35 | 7:35 | 7:35 | 1.96x |
8 | 87C251SB | Src | Ext | 1 | non-page | TMAC2 | 11:07 | 11:07 | 11:07 | 11:07 | 1.34x |
9 | 87C251SB | Bin | Ext | 0 | page | TMAC1 | 4:06 | 4:06 | 4:05 | 4:06 | 3.63x |
10 | 87C251SB | Bin | Ext | 1 | page | TMAC1 | 6:57 | 6:57 | 6:57 | 6:57 | 2.14x |
11 | 87C251SB | Src | Ext | 0 | page | TMAC2 | 4:35 | 4:35 | 4:35 | 4:35 | 3.25x |
12 | 87C251SB | Src | Ext | 1 | page | TMAC2 | 7:52 | 7:51 | 7:51 | 7:51 | 1.90x |
13 | 87C251SB | Bin | Int | 0 | non-page | TMAC1 | 3:41 | 3:41 | 3:41 | 3:41 | 4.04x |
14 | 87C251SB | Src | Int | 0 | non-page | TMAC2 | 4:10 | 4:10 | 4:10 | 4:10 | 3.57x |
TABLE 4.1
TMAC1 is written in 100% 51 instructions and assembled in Binary Mode
TMAC7 is written in optimised 251 instructions and assembled in Binary Mode
TMAC8 is written in optimised 251 instructions and assembled in Source Mode
Unit | Device | Mode | Mem | W/s | Page | Prog | Time 1 (Min:Sec) | Time 2 (Min:Sec) | Time 3 (Min:Sec) | Ave (Min:Sec) | Ratio to FX |
1 | 87C51FB | - | Int | - | - | TMAC1 | 14:53 | 14:53 | 14:53 | 14:53 | 1x |
2 | 87C51FB | - | Ext | - | - | TMAC1 | 14:53 | 14:53 | 14:53 | 14:53 | 1x |
3 | 87C251SB | Bin | Ext | 0 | non-page | TMAC7 | 2:34 | 2:34 | 2:34 | 2:34 | 5.80x |
4 | 87C251SB | Bin | Ext | 1 | non-page | TMAC7 | 3:39 | 3:39 | 3:39 | 3:39 | 4.08x |
5 | 87C251SB | Src | Ext | 0 | non-page | TMAC8 | 2:21 | 2:21 | 2:21 | 2:21 | 6.33x |
6 | 87C251SB | Src | Ext | 1 | non-page | TMAC8 | 3:17 | 3:18 | 3:17 | 3:17 | 4.53x |
7 | 87C251SB | Bin | Ext | 0 | page | TMAC7 | 1:43 | 1:43 | 1:43 | 1:43 | 8.67x |
8 | 87C251SB | Bin | Ext | 1 | page | TMAC7 | 2:38 | 2:39 | 2:38 | 2:38 | 5.65x |
9 | 87C251SB | Src | Ext | 0 | page | TMAC8 | 1:36 | 1:36 | 1:36 | 1:36 | 9.30x |
10 | 87C251SB | Src | Ext | 1 | page | TMAC8 | 2:25 | 2:25 | 2:25 | 2:25 | 6.16x |
11 | 87C251SB | Bin | Int | 0 | non-page | TMAC7 | 1:33 | 1:33 | 1:33 | 1:33 | 9.60x |
12 | 87C251SB | Src | Int | 0 | non-page | TMAC8 | 1:26 | 1:26 | 1:26 | 1:26 | 10.38x |
TABLE 4.2
The tests mainly exercised the multiplication, addition, move and branch instructions. The code may not be the most optimised possible, but it gives a rough idea of the actual performance of the microcontrollers.
Overall, the test programs in experiments 1, 2, 3 and 4 are small programs that loop many times, so they may not show the ideal performance of the 87C251SB microcontroller. In real applications, larger programs with fewer loops are used, and the performance of the 87C251SB microcontroller is expected to be better.
© 1997 Intel Corporation