
Benchmark Report: 8XC251SB Relative Performance over the 8XC51FB

Note: The complete report is available on the Intel BBS under the filename 251BENCH.EXE (self-extracting, Word 6.0 format). It contains additional results as well as code examples of the tests performed. You can call the BBS at 916-356-3600.

OBJECTIVE

The purpose of this benchmark was to compare the CPU performance of the 8XC251SB against the 8XC51FX across different programs and different hardware.

SUMMARY OF BENCHMARK

A total of 8 programs were written for four experiments: four in pure 51 instructions and four in optimized 251 instructions. Of the four program types, the first three are 64-byte Data Transfer, Multiply and Accumulate, and 3x3 Matrix Multiplication. The fourth combines the first three to measure the overall performance of the microcontrollers. The source for the first program, which performs the 64-byte Data Transfer, is included in Appendix A.

All programs were assembled in two modes: binary mode and source mode. Programs with 51 instructions assembled in binary mode are 8XC51FX compatible and can be run by all units. Programs with 51 instructions assembled in source mode and programs with 251 instructions in both binary and source mode can only be run by 87C251SB microcontrollers.

Experiments 1, 2 and 3 were run on the EV80C51FX evaluation board, while Experiment 4 was conducted on the 8XC51FX target board.

APPARATUS

Experiment 1, 2 and 3

Experiment 4

PROCEDURE

EXPERIMENT 1

This experiment compares the processing speed of the 8XC51FX to the 8XC251SB by emulating data transfer from internal code memory to external data memory. The programs used are shown below:

The flow of the programs is shown below:

In every loop of TMAC11B, TMAC11S, TMAC12B and TMAC12S, the following tasks were performed:

Task                                                          Instruction Type
a. Loop 3825 times. In each loop move 64 bytes of constant    CPU
   data from internal code memory to external data memory.
b. Flashing the LED through port 1.                           I/O
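
As a rough illustration of task (a), the workload can be modelled in a few lines of Python. This is only a sketch of the work being counted, not the TMAC11B or TMAC12B assembly source (which is in the full report). The names `transfer_block` and `run_workload` and the byte pattern used as constant data are assumptions made for the example.

```python
# Hypothetical model of the Experiment 1 workload. The real test programs are
# assembly routines; this only mirrors the amount of data they move.
BLOCK_SIZE = 64      # bytes moved per loop (from the task table)
LOOP_COUNT = 3825    # number of loops (from the task table)


def transfer_block(code_mem, data_mem, size=BLOCK_SIZE):
    """Copy `size` bytes one at a time, as a byte-wide move loop would."""
    for i in range(size):
        data_mem[i] = code_mem[i]


def run_workload(loops=LOOP_COUNT):
    """Repeat the block transfer and report the total number of bytes moved."""
    code_mem = bytes(range(BLOCK_SIZE))   # stand-in for constants in code memory
    data_mem = bytearray(BLOCK_SIZE)      # stand-in for external data memory
    for _ in range(loops):
        transfer_block(code_mem, data_mem)
    return data_mem, loops * BLOCK_SIZE
```

With the loop count and block size from the task table, one run moves 3825 x 64 = 244,800 bytes.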

The results of the benchmarking are shown as follows:

A) MCS 51 COMPATIBILITY

TMAC11B is written in 100% 51 instructions and assembled in Binary Mode

TMAC11S is written in 100% 51 instructions and assembled in Source Mode

                                                   Time (Min:Sec)
Unit  Device    Mode  Mem  W/s  Page      Prog     1      2      3      Ave    Ratio to FX
1     87C51FB   -     Ext  -    -         TMAC11B  3:40   3:40   3:40   3:40   1x
2     87C251SB  Bin   Ext  0    non-page  TMAC11B  1:18   1:18   1:18   1:18   2.82x
3     87C251SB  Bin   Ext  1    non-page  TMAC11B  1:51   1:51   1:51   1:51   1.98x
4     87C251SB  Src   Ext  0    non-page  TMAC11S  1:42   1:42   1:42   1:42   2.16x
5     87C251SB  Src   Ext  1    non-page  TMAC11S  2:27   2:27   2:27   2:27   1.50x
6     87C251SB  Bin   Int  0    non-page  TMAC11B  0:44   0:44   0:44   0:44   5x
7     87C251SB  Src   Int  0    non-page  TMAC11S  0:57   0:57   0:57   0:57   3.86x

TABLE 1.1
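
The report does not state how the "Ratio to FX" column is calculated, but the figures in Table 1.1 are consistent with dividing the 87C51FB average time by each unit's average time. The sketch below shows that arithmetic. The helper names `to_seconds` and `ratio_to_fx` are illustrative and do not come from the report.

```python
def to_seconds(mmss):
    """Convert a 'Min:Sec' string such as '3:40' to a number of seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def ratio_to_fx(fx_time, unit_time):
    """Speed-up of a unit relative to the 87C51FB baseline (assumed formula)."""
    return to_seconds(fx_time) / to_seconds(unit_time)


# Unit 2 in Table 1.1: 3:40 (220 s) against 1:18 (78 s)
print(round(ratio_to_fx("3:40", "1:18"), 2))   # 2.82
```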

B) MCS 251 OPTIMIZATION

TMAC11B is written in 100% 51 instructions and assembled in Binary Mode

TMAC12B is written in optimised 251 instructions and assembled in Binary Mode

TMAC12S is written in optimised 251 instructions and assembled in Source Mode

                                                   Time (Min:Sec)
Unit  Device    Mode  Mem  W/s  Page      Prog     1      2      3      Ave    Ratio to FX
1     87C51FB   -     Ext  -    -         TMAC11B  3:40   3:40   3:40   3:40   1x
2     87C251SB  Bin   Ext  0    non-page  TMAC12B  0:31   0:31   0:31   0:31   7.10x
3     87C251SB  Bin   Ext  1    non-page  TMAC12B  0:43   0:43   0:43   0:43   5.12x
4     87C251SB  Src   Ext  0    non-page  TMAC12S  0:27   0:27   0:27   0:27   8.15x
5     87C251SB  Src   Ext  1    non-page  TMAC12S  0:37   0:37   0:37   0:37   5.96x
6     87C251SB  Bin   Int  0    non-page  TMAC12B  0:14   0:14   0:14   0:14   15.71x
7     87C251SB  Src   Int  0    non-page  TMAC12S  0:12   0:12   0:12   0:12   18.33x

TABLE 1.2

EXPERIMENT 2

This experiment compares the processing speed of the 8XC51FX to the 8XC251SB by performing Multiplication and Accumulation (MAC) routines on 16-bit signed integers with 32-bit results. The programs used are shown below:

The flow of the programs is shown below:

In every loop of TMAC21B, TMAC21S, TMAC22B and TMAC22S, the following tasks were performed:

Task                                                          Instruction Type
a. Loop 65,025 times a 16 bit Multiplication and              CPU
   Accumulation (MAC)
b. Flashing the LED through port 1.                           I/O
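
A single MAC step of the kind described in task (a) can be sketched as follows. The report does not say how the test programs treat accumulator overflow, so the wrap-around to 32 bits here is an assumption, as are the helper names `to_int16`, `to_int32` and `mac`.

```python
def to_int16(x):
    """Interpret the low 16 bits of x as a signed two's-complement value."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x


def to_int32(x):
    """Interpret the low 32 bits of x as a signed two's-complement value."""
    x &= 0xFFFFFFFF
    return x - 0x100000000 if x & 0x80000000 else x


def mac(acc, a, b):
    """Return acc + a*b for 16-bit signed operands and a 32-bit signed result.

    Overflow wraps around (an assumption; the report does not specify this).
    """
    return to_int32(acc + to_int16(a) * to_int16(b))
```

For example, `mac(10, 0xFFFF, 0xFFFF)` treats both operands as -1 and returns 11.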

The results of the benchmarking are shown as follows:

A) MCS 51 COMPATIBILITY

TMAC21B is written in 100% 51 instructions and assembled in Binary Mode

TMAC21S is written in 100% 51 instructions and assembled in Source Mode

                                                   Time (Min:Sec)
Unit  Device    Mode  Mem  W/s  Page      Prog     1      2      3      Ave    Ratio to FX
1     87C51FB   -     Ext  -    -         TMAC21B  4:17   4:17   4:17   4:17   1x
2     87C251SB  Bin   Ext  0    non-page  TMAC21B  1:42   1:41   1:42   1:42   2.52x
3     87C251SB  Bin   Ext  1    non-page  TMAC21B  2:29   2:29   2:29   2:29   1.72x
4     87C251SB  Src   Ext  0    non-page  TMAC21S  2:12   2:12   2:12   2:12   1.95x
5     87C251SB  Src   Ext  1    non-page  TMAC21S  3:15   3:15   3:15   3:15   1.32x
6     87C251SB  Bin   Int  0    non-page  TMAC21B  0:57   0:57   0:57   0:57   4.51x
7     87C251SB  Src   Int  0    non-page  TMAC21S  1:12   1:12   1:12   1:12   3.57x

TABLE 2.1

B) MCS 251 OPTIMIZATION

TMAC21B is written in 100% 51 instructions and assembled in Binary Mode

TMAC22B is written in optimised 251 instructions and assembled in Binary Mode

TMAC22S is written in optimised 251 instructions and assembled in Source Mode

                                                   Time (Min:Sec)
Unit  Device    Mode  Mem  W/s  Page      Prog     1      2      3      Ave    Ratio to FX
1     87C51FB   -     Ext  -    -         TMAC21B  4:17   4:17   4:17   4:17   1x
2     87C251SB  Bin   Ext  0    non-page  TMAC22B  1:09   1:09   1:09   1:09   3.72x
3     87C251SB  Bin   Ext  1    non-page  TMAC22B  1:39   1:39   1:39   1:39   2.60x
4     87C251SB  Src   Ext  0    non-page  TMAC22S  1:00   1:00   1:00   1:00   4.28x
5     87C251SB  Src   Ext  1    non-page  TMAC22S  1:25   1:25   1:25   1:25   3.02x
6     87C251SB  Bin   Int  0    non-page  TMAC22B  0:36   0:36   0:36   0:36   7.14x
7     87C251SB  Src   Int  0    non-page  TMAC22S  0:32   0:32   0:32   0:32   8.03x

TABLE 2.2

EXPERIMENT 3

This experiment compares the processing speed of the 8XC51FX and 8XC251SB microcontrollers by performing 3x3 Matrix Multiplication on 16-bit signed integers with 32-bit results. The programs used are shown below:

The flow of the programs is shown below:

In every loop of TMAC31B, TMAC31S, TMAC32B and TMAC32S, the following tasks were performed:

Task                                                          Instruction Type
a. Loop 3825 times the 16 bit 3 x 3 Matrix Multiplication     CPU
b. Flashing the LED through port 1.                           I/O
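
For reference, one 3 x 3 matrix multiplication as described in task (a) takes 27 multiplications and 18 additions. The sketch below shows the calculation with 16-bit signed inputs and 32-bit signed results. It is not the TMAC31B or TMAC32B source, and the wrap-around on overflow and the names `wrap32` and `matmul3` are assumptions for illustration.

```python
def wrap32(x):
    """Reduce x to a signed 32-bit value (wrap-around is an assumption)."""
    x &= 0xFFFFFFFF
    return x - 0x100000000 if x & 0x80000000 else x


def matmul3(a, b):
    """Multiply two 3 x 3 matrices of 16-bit signed integers.

    Each result element is the sum of three 16 x 16 bit products,
    held as a 32-bit signed value.
    """
    result = [[0] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(3):
            acc = 0
            for k in range(3):
                acc = wrap32(acc + a[i][k] * b[k][j])
            result[i][j] = acc
    return result
```

Note that three products of the most negative 16-bit value can exceed the 32-bit signed range, so an implementation has to decide what happens in that corner case.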

The results of the benchmarking are shown as follows:

A) MCS 51 COMPATIBILITY

TMAC31B is written in 100% 51 instructions and assembled in Binary Mode

TMAC31S is written in 100% 51 instructions and assembled in Source Mode

                                                   Time (Min:Sec)
Unit  Device    Mode  Mem  W/s  Page      Prog     1      2      3      Ave    Ratio to FX
1     87C51FB   -     Ext  -    -         TMAC31B  6:58   6:58   6:59   6:58   1x
2     87C251SB  Bin   Ext  0    non-page  TMAC31B  3:41   3:41   3:41   3:41   1.89x
3     87C251SB  Bin   Ext  1    non-page  TMAC31B  5:25   5:25   5:26   5:25   1.29x
4     87C251SB  Src   Ext  0    non-page  TMAC31S  3:41   3:41   3:41   3:41   1.89x
5     87C251SB  Src   Ext  1    non-page  TMAC31S  5:25   5:25   5:25   5:25   1.29x
6     87C251SB  Bin   Int  0    non-page  TMAC31B  2:00   2:00   2:00   2:00   3.48x
7     87C251SB  Src   Int  0    non-page  TMAC31S  2:01   2:01   2:01   2:01   3.45x

TABLE 3.1

B) MCS 251 OPTIMIZATION

TMAC31B is written in 100% 51 instructions and assembled in Binary Mode

TMAC32B is written in optimised 251 instructions and assembled in Binary Mode

TMAC32S is written in optimised 251 instructions and assembled in Source Mode

                                                   Time (Min:Sec)
Unit  Device    Mode  Mem  W/s  Page      Prog     1      2      3      Ave    Ratio to FX
1     87C51FB   -     Ext  -    -         TMAC31B  6:58   6:58   6:59   6:58   1x
2     87C251SB  Bin   Ext  0    non-page  TMAC32B  0:59   0:59   0:59   0:59   7.08x
3     87C251SB  Bin   Ext  1    non-page  TMAC32B  1:23   1:23   1:23   1:23   5.04x
4     87C251SB  Src   Ext  0    non-page  TMAC32S  0:53   0:53   0:53   0:53   7.89x
5     87C251SB  Src   Ext  1    non-page  TMAC32S  1:13   1:13   1:13   1:13   5.73x
6     87C251SB  Bin   Int  0    non-page  TMAC32B  0:35   0:35   0:35   0:35   11.94x
7     87C251SB  Src   Int  0    non-page  TMAC32S  0:32   0:32   0:32   0:32   13.06x

TABLE 3.2

EXPERIMENT 4

This experiment compares the processing speed of the 8XC51FX and 8XC251SB microcontrollers on a combined program consisting of emulated data transfer from internal code memory to external data memory, Multiplication and Accumulation (MAC), and 3x3 Matrix Multiplication on 16-bit signed integers. Experiment 4 used different hardware: the 8XC51FX target board, which has page mode capability, was used to test the performance of this new feature of the 87C251SB. The programs used are shown below:

The flow of TMAC1 is shown below:

In every loop of TMAC1, the following tasks were performed:

Task                                                          Instruction Type
a. Loop 3825 times moving 64 bytes of constant data from      CPU
   internal code memory to external data memory
b. Loop 65,025 times a 16 bit Multiplication and              CPU
   Accumulation (MAC)
c. Loop 3825 times a 16 bit 3 x 3 matrix Multiplication       CPU
d. Flashing the LED through port 1                            I/O

The results of the benchmarking are shown as follows:

A) MCS 51 COMPATIBILITY

TMAC1 is written in 100% 51 instructions and assembled in Binary Mode

TMAC2 is written in 100% 51 instructions and assembled in Source Mode

                                                   Time (Min:Sec)
Unit  Device    Mode  Mem  W/s  Page      Prog     1      2      3      Ave    Ratio to FX
1     87C51FB   -     Int  -    -         TMAC1    14:53  14:53  14:53  14:53  1x
2     87C51FB   -     Ext  -    -         TMAC1    14:53  14:53  14:53  14:53  1x
5     87C251SB  Bin   Ext  0    non-page  TMAC1    6:40   6:40   6:40   6:40   2.23x
6     87C251SB  Bin   Ext  1    non-page  TMAC1    9:45   9:45   9:45   9:45   1.53x
7     87C251SB  Src   Ext  0    non-page  TMAC2    7:35   7:35   7:35   7:35   1.96x
8     87C251SB  Src   Ext  1    non-page  TMAC2    11:07  11:07  11:07  11:07  1.34x
9     87C251SB  Bin   Ext  0    page      TMAC1    4:06   4:06   4:05   4:06   3.63x
10    87C251SB  Bin   Ext  1    page      TMAC1    6:57   6:57   6:57   6:57   2.14x
11    87C251SB  Src   Ext  0    page      TMAC2    4:35   4:35   4:35   4:35   3.25x
12    87C251SB  Src   Ext  1    page      TMAC2    7:52   7:51   7:51   7:51   1.90x
13    87C251SB  Bin   Int  0    non-page  TMAC1    3:41   3:41   3:41   3:41   4.04x
14    87C251SB  Src   Int  0    non-page  TMAC2    4:10   4:10   4:10   4:10   3.57x

TABLE 4.1
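
The effect of page mode on its own is not listed as a separate column, but it can be estimated from the averages in Table 4.1 by dividing each non-page time by the page-mode time for the same mode and wait-state setting. The sketch below does this. The `PAIRS` dictionary and the function names are made up for the example, and the resulting gains are derived here rather than quoted from the report.

```python
def to_seconds(mmss):
    """Convert a 'Min:Sec' string such as '6:40' to a number of seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def page_mode_gain(non_page, page):
    """Speed-up from enabling page mode, other settings unchanged."""
    return to_seconds(non_page) / to_seconds(page)


# (mode, wait states) -> (non-page average, page average), from Table 4.1
PAIRS = {
    ("Bin", 0): ("6:40", "4:06"),
    ("Bin", 1): ("9:45", "6:57"),
    ("Src", 0): ("7:35", "4:35"),
    ("Src", 1): ("11:07", "7:51"),
}

for (mode, ws), (non_page, page) in PAIRS.items():
    print(mode, ws, round(page_mode_gain(non_page, page), 2))
```

On these figures, page mode gives roughly a 1.6x gain at 0 wait states and roughly a 1.4x gain at 1 wait state for the 51-instruction programs.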

B) MCS 251 OPTIMIZATION

TMAC1 is written in 100% 51 instructions and assembled in Binary Mode

TMAC7 is written in optimised 251 instructions and assembled in Binary Mode

TMAC8 is written in optimised 251 instructions and assembled in Source Mode

                                                   Time (Min:Sec)
Unit  Device    Mode  Mem  W/s  Page      Prog     1      2      3      Ave    Ratio to FX
1     87C51FB   -     Int  -    -         TMAC1    14:53  14:53  14:53  14:53  1x
2     87C51FB   -     Ext  -    -         TMAC1    14:53  14:53  14:53  14:53  1x
3     87C251FB  Bin   Ext  0    non-page  TMAC7    2:34   2:34   2:34   2:34   5.80x
4     87C251FB  Bin   Ext  1    non-page  TMAC7    3:39   3:39   3:39   3:39   4.08x
5     87C251FB  Src   Ext  0    non-page  TMAC8    2:21   2:21   2:21   2:21   6.33x
6     87C251FB  Src   Ext  1    non-page  TMAC8    3:17   3:18   3:17   3:17   4.53x
7     87C251FB  Bin   Ext  0    page      TMAC7    1:43   1:43   1:43   1:43   8.67x
8     87C251FB  Bin   Ext  1    page      TMAC7    2:38   2:39   2:38   2:38   5.65x
9     87C251FB  Src   Ext  0    page      TMAC8    1:36   1:36   1:36   1:36   9.30x
10    87C251FB  Src   Ext  1    page      TMAC8    2:25   2:25   2:25   2:25   6.16x
11    87C251FB  Bin   Int  0    non-page  TMAC7    1:33   1:33   1:33   1:33   9.60x
12    87C251FB  Src   Int  0    non-page  TMAC8    1:26   1:26   1:26   1:26   10.38x

TABLE 4.2

FINAL CONCLUSION

The tests mainly exercised the multiplication, addition, move and branch instructions. They may not show the most optimised code, but they give a rough idea of the actual performance of the microcontrollers.

Overall, the test programs in experiments 1, 2, 3 and 4 consist of small programs looping many times, so they may not show the ideal performance of the 87C251SB microcontroller. In real life, larger programs with fewer loops will be used, and the performance of the 87C251SB microcontroller will be better.


Legal Stuff © 1997 Intel Corporation