Complex Thoughts

What is the complex in my 8590/95 or 9590/9595?  Look HERE

Type0   Type 1 (G-K)   Type 2 (H-L)  Type 3 (M)   Type 4 (N-Y)

Useless Trivia
16 Bit Busmasters
Will a -xxx Complex work in my 90 / 95?

OK, crankheads, here's an unscientific comparison of some complex performances. Some unusual results... 

Reason for the Processor Complex
   In the first PS/2* models, most components were integrated into the planar of the system. This severely limited upgrade options and flexibility. When one component was upgraded, for example the processor, other components such as the I/O controller and the memory controller were not. This created combinations of fast and slow components, and therefore unbalanced systems. Unbalanced systems are not as efficient as balanced systems, where every component's performance is matched against the other components' performance. 
   With this in mind, the server's key components have been grouped together on a separate card known as a processor complex. Now the processor is contained on a removable processor complex board, which also holds the processor/memory bus, the memory controller, DMA controller, and Micro Channel* bus interface. Placing the processor on a complex together with key components means that when a system is upgraded, balanced system performance can be maintained. 
   IBM has provided an upgrade path for existing and future file servers that allows network design engineers to replace the system processor complex with a faster and more efficient system processor complex at a later date. This policy of upgrading allows the server to accommodate increased server CPU utilization without the need to buy a complete new machine. Within the processor complex there are many features that are capable of providing more efficient data transfer. These can include: 
· Cache 
· Dual Path to Memory 
· Two-Way Interleaved Memory Banks 
· 32-bit DMA Controller 
· 40MBps Data Streaming 

Complex Features
   The processor complex consists of the devices and features in the computer that perform logical operations and calculations, control access to memory, and manage data-transfer operations. The following devices and features make up the processor complex: 

o The microprocessor 
o The memory subsystem 
o The direct memory access (DMA) controller 

   If your computer contains a processor complex, it is connected to the system board by two 164-pin, 82-position connectors, known as the processor interface connection. The processor interface connection provides: 

o The Micro Channel interface, which allows data to be transferred between the processor complex and the adapters in the Micro Channel expansion slots. 

o The system board interface, which allows the transfer of data between the processor complex and devices on the system board, such as the parallel, serial, keyboard, and auxiliary-device ports (for an explanation of these devices, see Input/Output Connectors and Ports). 

o Two memory interfaces (Dual Path), systems M-Y, which the processor complex uses to read from and write to system memory. All access to system memory is through the memory controller in the processor complex. 



L1 or Processor Cache
      There are two levels of cache. The cache incorporated into the main system processor is known as Level 1 (L1) cache. The 486 incorporates a single 8KB cache (Overdrive chips can have 16KB). Pentiums have two 8KB caches, one for instructions and one for data. These caches act as temporary storage places for instructions and data obtained from slower, main memory. When a system uses data, it will be likely to use it again, and getting it from an on-chip cache is much faster than getting it from main memory. 

L2 Cache 

   The second level of cache, called second-level cache or Level 2 cache, provides additional high speed memory to the Level 1 cache. This additional cache memory works together with the cache memory native to the main processor (L1). If the processor cannot find what it needs in the processor cache (a first-level cache miss), it then looks in the additional cache memory. If it finds the code or data there (a second-level cache hit), the processor will use it, and continue. If the data is in neither of the caches, an access to planar memory must occur. (G, H, and L complexes do NOT have L2 cache, nor do they have a cache socket). 
   L2 cache can be accessed 5 to 10 times faster than standard memory. Cache memory uses Static Random Access Memory (SRAM) which is much faster than the Dynamic Random Access Memory (DRAM) used for system memory. SRAM is more expensive and requires more power, which is why it is not used for all memory. 
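
   The lookup order described above (L1 first, then L2, then planar memory) can be sketched roughly as follows. This is a toy model in Python, not a description of the actual hardware: the dictionaries standing in for the caches, the function name read, and the relative costs are all made up for illustration.

```python
# Toy model of a two-level cache lookup. All names and the relative
# costs below are illustrative assumptions, not measured values.
L1_COST, L2_COST, MEMORY_COST = 1, 3, 20   # arbitrary relative access costs

def read(address, l1, l2, memory):
    """Return (value, cost, source) for one read, filling caches on a miss."""
    if address in l1:                      # first-level cache hit
        return l1[address], L1_COST, "L1"
    if address in l2:                      # L1 miss, second-level cache hit
        l1[address] = l2[address]
        return l1[address], L1_COST + L2_COST, "L2"
    value = memory[address]                # miss in both: go to planar memory
    l2[address] = value
    l1[address] = value
    return value, L1_COST + L2_COST + MEMORY_COST, "memory"
```

   The first read of an address pays the full trip to memory; a repeat read of the same address is served from L1. On a complex without L2 cache, the middle step simply never hits.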



Memory Controller

   The memory controller is a device on the system board or processor board that controls access to system memory by the microprocessor and I/O devices. Registers in the memory controller contain information about the amount and type of memory that is installed in the computer. During a system reset, the power-on self-test (POST) routine writes this information into the registers. (For information about POST, see Power-On Self-Test (POST) and Upgradable BIOS.) 

The functions of the memory controller vary among PS/2 models. They can include: 

o Dual-bus capability, which allows the microprocessor to read from and write to system memory while a bus master is controlling the Micro Channel bus. (See the Three Types of Overlapped Access.)

o Memory timing control, which coordinates data-transfer operations involving single inline memory modules that operate at different speeds. 

o Cache control, which ensures the validity of the contents of the cache.

   The cache controller (or, in some PS/2 models, the memory controller) identifies the instructions and data that are most likely to be needed while a specific program (or part of a program) is running and copies them from system memory into the cache. During processing, as requirements change, the cache controller copies other data and instructions into the cache, replacing data and instructions that are no longer needed in the cache. Computer performance is improved each time the microprocessor finds what it needs in the cache (a cache hit). If it does not find what it needs in the cache (a cache miss), the cache controller must locate the data or instruction in system memory and copy it into the cache, while one or more wait states are imposed on the microprocessor. The cache controller manages the use of the cache so that the number of cache hits far exceeds the number of cache misses. 

   In some PS/2 models, the microprocessor has only a built-in level-1 cache, but it supports an optional 256KB level-2 cache. This 256KB cache option increases the amount of cache memory in the computer, which increases the probability of cache hits.

o Bus-width allocation, which supports 8-, 16-, and 32-bit data-transfer operations. 

o Memory interleaving, which is a method of reducing the time the microprocessor has to wait for system memory to respond during memory I/O operations. 



Dual Path to Memory
   When bus masters were implemented on Micro Channel servers, it was found that there was often contention for memory access between the processor and the bus masters, and that the processor was being delayed waiting for bus masters to release the path into memory. The new design of the processor complexes addresses these issues by providing a dual path into memory: one path from the processor and one from the Micro Channel. These two separate paths to system memory allow overlapping of processor and bus master cycles. (M-Y complexes)

Three kinds of overlapped cycles can occur: 
· CPU reads to L2 cache simultaneously with bus master I/O 
o When the microprocessor is reading from or writing to its internal cache or to the optional 256KB (KB equals 1024 bytes) cache, the bus master that is controlling the Micro Channel bus has exclusive access to system memory.  
· CPU reads to L2 cache simultaneously with bus master memory access 
o The microprocessor and the bus master that is controlling the Micro Channel bus can use the system memory at the same time, provided that they do not try to use the same memory locations. 
· CPU reads to memory simultaneously with bus master I/O 
o When a bus master is reading from or writing to an I/O device or an adapter in a Micro Channel expansion slot, the microprocessor has exclusive access to system memory. 
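
   The three overlap cases above boil down to a simple rule about who is touching system memory at a given moment. Here is a minimal sketch of that rule in Python. The target names ("cache", "memory", "io") and the function can_overlap are hypothetical, and treating a conflict as "same address" is a simplification of whatever the real memory controller checks.

```python
# Sketch of the three overlap cases. Target names ("cache", "memory",
# "io") and the function itself are hypothetical, for illustration only.

def can_overlap(cpu_target, bus_master_target, cpu_addr=None, bm_addr=None):
    """Return True if a CPU cycle and a bus master cycle can run together."""
    if cpu_target == "cache":
        # CPU is served by L1/L2, so the bus master has memory to itself.
        return True
    if bus_master_target == "io":
        # Bus master is talking to an adapter, so the CPU has memory to itself.
        return True
    if cpu_target == "memory" and bus_master_target == "memory":
        # Both paths hit system memory: fine unless they want the same location.
        return cpu_addr != bm_addr
    return False
```

   On a machine without the dual path, the last case would always have to serialize, which is where the processor ends up waiting.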

   Both processor and Micro Channel cycles are buffered into 16-byte blocks, further alleviating the contention for memory by reducing the frequency of the accesses. Implementing dual-path access to memory and the buffering of cycles can give a system throughput of up to three times that of a server without them. 
   In computers that do not have a dual bus, the microprocessor is the default master, which means that it has to wait until no other masters are controlling the Micro Channel bus before it can have access to system memory. 



Two-Way Interleaved Memory Banks
   Another performance advantage is gained when the processor is accessing memory in burst mode. Memory is split into two banks, and data or code is stored sequentially across these banks; for example, addresses 0 and 2 are held in bank 0, and addresses 1 and 3 are held in bank 1. The reason for this arrangement is that when a 486 burst mode request is made, the accesses to memory will be sequential. When the memory controller detects such a burst request to, for example, bank 0, it also pre-fetches the next 32 bits of data from bank 1. This way, the processor is not kept waiting while the information is being retrieved from memory. 
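
   The bank arrangement can be sketched as a small address calculation. This is only an illustration in Python: it assumes one "address" in the paragraph above means one 32-bit word, numbers the two banks 0 and 1, and uses made-up function names.

```python
# Two-way interleaving sketch. "Word" here means one 32-bit unit;
# the function names are hypothetical.
WORD_BYTES = 4
BANKS = 2

def bank_of(byte_address):
    """Bank that holds the 32-bit word containing byte_address."""
    return (byte_address // WORD_BYTES) % BANKS

def prefetch_target(byte_address):
    """Address of the next sequential word, which sits in the other bank."""
    return (byte_address // WORD_BYTES + 1) * WORD_BYTES
```

   Because consecutive words alternate between banks, the word a burst will want next is always in the bank that is not busy with the current access.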

DMA Controller
   The DMA controller is integrated into the processor board and manages all DMA data transfers. Transferring data between system memory and an I/O device requires two steps. Data goes from the sending device to the DMA controller and then to the receiving device. The microprocessor gives the DMA controller the location, destination, and amount of data that is to be transferred. Then the DMA controller transfers the data, allowing the microprocessor to continue with other processing tasks. 

  When a device needs to use the Micro Channel bus to send or receive data, it competes with all the other devices that are trying to gain control of the bus. This process is known as arbitration.  (For additional information, see Arbitration.)  The DMA controller does not arbitrate for control of the bus; instead, the I/O device that is sending or receiving data (the DMA slave) participates in arbitration. (For additional information about slaves, refer to Slaves.)  It is the DMA controller, however, that takes control of the bus when the central arbitration control point grants the DMA slave's  request. 
   The DMA controller is a dedicated unit with the ability to move data between system memory and a device on the Micro Channel. It is used by simple adapters, and also by the parallel and serial ports. Earlier versions of the Model 95 (G-L complexes) implemented a 24-bit DMA controller, limiting DMA memory transfers to addresses below 16MB (whereas the 486 processor was able to address up to 4GB of memory). On 32-bit systems with more than 16MB of memory, this could cause problems if a DMA access was for memory above 16MB. The operating system could work around the problem by ensuring that DMA buffers were always below 16MB when a DMA transfer was done, but this imposed a performance penalty. 
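
   The 16MB figure falls straight out of the 24 address bits. A quick Python sketch of the arithmetic, plus the kind of check an operating system would need before handing a buffer to a 24-bit DMA controller; the helper name needs_low_buffer is hypothetical and not taken from any real driver.

```python
# 24-bit DMA reach. The helper name is hypothetical.
DMA_ADDRESS_BITS = 24
DMA_LIMIT = 1 << DMA_ADDRESS_BITS          # 16,777,216 bytes = 16MB

def needs_low_buffer(start, length):
    """True if any byte of the buffer lies at or above the 16MB line."""
    return start + length > DMA_LIMIT
```

   When the check comes back True, the data has to be staged through a buffer below 16MB and copied, which is the performance penalty mentioned above.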

   Direct memory access (DMA) is a method of transferring data between system memory and I/O devices without requiring intervention by the microprocessor. DMA is more efficient than programmed I/O, in which the microprocessor reads the data from the sending device and then writes it to the receiving device. In DMA data transfers, data can bypass the microprocessor as it moves between system memory and I/O devices. DMA improves computer performance because the microprocessor does not have to interrupt its processing activities to manage data transfers. 

40MBps Data Streaming
The 40MBps data streaming transfer (M through Y complexes) offers considerably improved I/O performance. In many cases, blocks transferred to and from memory are stored in sequential addresses, so repeatedly sending the address for each four bytes is unnecessary. With data streaming transfer, the initial address is sent, then the blocks of data are sent, on the assumption that the data requests are sequential. 
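
To see why dropping the repeated addresses helps, count bus phases for one sequential block. The Python sketch below assumes a deliberately simple model that is not taken from the Micro Channel specification: a basic transfer is one address phase plus one data phase per four bytes, and a streaming transfer is one address phase followed by one data phase per four bytes. The function names are made up.

```python
# Counting bus phases for a sequential block. The one-address-phase,
# one-data-phase model is a simplifying assumption for illustration.
BYTES_PER_TRANSFER = 4

def basic_phases(block_bytes):
    """Every four-byte transfer sends its own address, then its data."""
    transfers = -(-block_bytes // BYTES_PER_TRANSFER)   # ceiling division
    return 2 * transfers

def streaming_phases(block_bytes):
    """One address up front, then one data phase per four bytes."""
    transfers = -(-block_bytes // BYTES_PER_TRANSFER)
    return 1 + transfers if transfers else 0
```

Under this model a 512-byte block takes 256 phases the basic way and 129 phases when streamed, so for long sequential blocks the bus carries data in nearly every phase.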



SynchroStream Controller

   SynchroStream controllers use IBM's most advanced technology packaging to integrate 5 major chips (memory, I/O, DMA controllers, FIFO buffers, ECC logic) into a single chip with a RISC-like architecture. This technology allows the high-speed interconnects and large streaming pipes that form the SynchroStream engine to provide state-of-the-art performance. 
   The SynchroStream controller synchronizes data traveling between major subsystems and allows it to stream in parallel, at full bandwidth, to each subsystem concurrently.  

   At the heart of the computer, data is moving continually between processor, cache, main memory and the Micro Channel. Typically there is a single path to memory, so fast devices like processors have to wait for much slower I/O devices, slowing down the performance of the entire system to the speed of the slowest device. The IBM SynchroStream controller was designed to overcome this problem. It synchronizes the operation of fast and slow devices and streams data to these devices to ensure all devices work on their data at their optimum levels of performance.  

   SynchroStream is an intelligent device in that it predicts what data the devices will need and loads it from memory before it is requested. When the device wants the data, it is presented to it from the IBM SynchroStream controller and the device can continue working immediately, as it does not have to wait for the data to be collected from memory. When devices are moving data into memory, the IBM SynchroStream controller holds the data, and writes it to memory when it is most efficient to do so. Since devices are not moving data to and from memory directly, but to the SynchroStream controller, each device has its own logical path to memory. Devices do not have to wait for other slower devices.  
    The SynchroStream engine operates by using a spinning valve that continuously forms different connections between pipes. Once a connection is made, data is streamed to the Micro Channel or processor at the highest possible rates. Parallel paths allow data to stream to multiple sources at the same time. The pipes even continue to stream after the connection is changed. Data is always streaming to the Micro Channel and processor, allowing them to operate at full bandwidth.  

   The IBM SynchroStream controller is located on the Pentium processor complexes, featured in the Server 95 and Server 95 Array systems. The implementation on the processor complex means that current PS/2 Server 95 and PS/2 Model 90 users can easily upgrade their machines to have IBM SynchroStream controller functions.  

Key advantages of the SynchroStream 
· Fast single chip implementation  
     Competitive designs are multi-chip and have the performance overhead of moving information between chips. SynchroStream technology provides a Zero Wait State Pentium implementation.  
· Intelligence  
        IBM SynchroStream is intelligent in that it predictively loads data from memory so that requesting devices are not kept waiting. In addition, writes to memory are stored within the IBM SynchroStream controller and written to memory to optimize memory utilization.  
· RISC-like architecture  
     Pipelines are used to move data in a fast, efficient manner between memory and the requesting device.  
· Stream data to Micro Channel devices  
    SynchroStream can stream data to Micro Channel devices at 40MBps.  
· Upgradable system implementation  
   Competitive system designs do not have the unique Upgradable Processor Complex design so you cannot upgrade to SynchroStream-like functions from earlier models.  


16 Bit Busmasters
     The 9595 used with 16-bit busmasters (for example, the PS/2 Micro Channel SCSI Adapter (#1005, 6451109)) that support 32-bits of addressing will cause system malfunction or potential loss of data when the user installs greater than 16MB of system memory. 

Complexes work in all 8590 / 8595 / 9590 / 9595 / 500
Any existing Model 90, Model 95, or PC Server 500 can be upgraded to a new Processor Complex. For example, Base 1 to Base 2 or Base 3 or Base 4; Base 2 to Base 4, etc. 

NOTE: The power supply in the Model 90 case is supposedly a little small for the DX50, P60, P66, and P90 complexes. In addition, the air baffle in the Model 90 may have to be removed if a processor with a big heat sink or heatsink / fan combination is installed. BUT I have to wonder: it's rated for 215 Watts, it isn't THAT small. Hell, the 9577 PS is only 194 Watts. 

95A Planar Lacks Backward Compatibility
   The 95A (dual serial/dual parallel) planar will NOT support other than a Type 4 complex. End of story. You will get a 174 error and nothing more. 

More Useless Trivia
Hi Louis ! 
 >Peter, what are the differences in the G, J, K, and M class boards? 

Obviously not only the speed ... 
   As far as I can see IBM followed different "evolutionary stages" with these boards. The first one presented was the 33MHz Type 1 (64F0198), which was offered as 64F0201 with only 25MHz almost at the same time. The 33MHz model had been the "top line" model with cache and all that. I got a "handwired" platform from Charles Lassiter, which still has 64F0198 printed on the *card* but a 25.00MHz oscillator and "handstamped" ASICs. Looks a lot like a pre-production sample - and proves that the 25MHz is derived from the 33MHz - not the other way round. The earliest HRM for the 95 however (dated March 1990 IIRC) mentions both of them. 
   It mentions "optional 256K cache" - which makes clear that no board other than 64F0198 and 64F0210 is meant. The 486DX2-33/66 board 92F0145 is then a much later development out of the Non-SOD 64F0198 - intended to use Flash-BIOS, but not fully developed or supported. (That's the board with the odd bank-select jumper in the top/right corner) 
   (Ed. edit based on personal inspection of a 92F0048) The 92F0048 also appears to be based on the Non-SOD 64F0198, with a DX50 CPU, a 50MHz oscillator, and some decoding circuitry mounted in the area where the SOD would go. The matching 12ns cache module is 92F0050. 
   The smaller type-1 platforms had been offered to form "entry models" - focussed on the Mod. 90 (92F0065 - 486SX-20 and 92F0049 - 486DX-20). The 92F0065, which I call "Kiddies CPUs", is the only one which has a "487SX-presence" jumper. 
   A totally different thread are the Type-2 platforms, which are all based on the 92F0079. The type-2 platforms were developed to make memory selection a bit easier, as a cost-efficient solution at the cost of some performance. 
   The type-3 platform of the -M- class was intended for high-end servers: paired memory *and* ECC support. Only a few Mod. 90s saw this platform as far as I know. Don't know if IBM ever offered it officially in the 90. I remember having seen 2 or 3 Mod. 9590-AMF at a customer - but they had the [PA]-sticker close to the serial-number decal ... which identifies them as "upgraded machines". 
   The -M- platform even survived the change from the 8595 to the 9595 (with the old planar and LED panel however) along with the 92F0161 486DX2-25/50 -L- platform. Strangely enough. 
   The final stage of the 486-line was reached with the Type-4 -N- class board 61G2343 - which was the predecessor to the 5V-Pentium (P5) platform. This however is a totally new development, at a time when 486 processors were already a bit dusty. 
   The Type-4 platforms are all very similar with the integrated Intel Cache chipset. I think a lot of experience from the -M- class DX-50 board was used. But this time everything fits on *one* printboard, making the funny shielded hi-density connectors for the "second floor" printboard obsolete. 

>Are there any ECAs related to any specific FRUs? 
   No. The "classical" -K-type board went through several "technical  changes without notice" and that was all (S.O.D. - still unclear which chip should fit in there ...). 
   So did the "92F0079-family" Type-2 boards. IBM never announced any technical changes on these boards. As far as I can recall there was no common ECA at any time on any of the processor boards. Only "withdrawn from marketing" notices ... 
   There had been some traffic on the IBM BBS in 1991 - 1993 when people found out that they could not use hard disks over 1GB with the Type 1 & 2 platforms, but IBM offered the upgrade EPROMs (for free !) until the deadline of December 1992, which was eventually stretched out to December 1993. 

Unsorted Oorts
       When installed with the 486SX/25 Processor Upgrade Option, 16-bit bus masters (for example, PS/2 Micro Channel SCSI Adapter (#1005, 6451109)) that support 32-bits of addressing will cause system malfunction and/or potential loss of data when the user installs greater than 16MB of system memory. 



Can't See >16MB Under W95
   On an IBM PS/2 Model 77, 90 or 95 computer with more than 16 MB of memory (RAM) installed, Windows 95 only recognizes and uses 16 MB of memory. This is because HIMEM.SYS, the extended memory manager, only detects 16 MB of memory installed. 
CAUSE    These computers use a nonstandard API for reporting memory in excess of 16 MB that is not supported  by Windows 95. 

RESOLUTION  This issue is resolved by the following updated file(s) for Windows 95 and OSR2, and later versions of these file(s): HIMEM.SYS  ver 3.95  dated 10/2/95  33,127 bytes 
  This and later versions of HIMEM.SYS support the API used by the computers listed in this article for reporting memory in excess of 16 MB. 

STATUS   Microsoft has confirmed this to be a problem in Microsoft Windows 95. An update to address this problem is now available, but is not fully regression tested and should be applied only to computers experiencing this specific problem. Unless you are severely impacted by this specific problem, Microsoft does not recommend implementing this update at this time. Contact Microsoft Technical Support for additional information about the availability of this update. 
  This issue is resolved in Microsoft Windows 98. 

MORE INFORMATION   The computers listed in this article use INT 15 ax=c700h to report memory above 16 MB. (Other IBM PS/2 Microchannel computers may use INT 15 ax=E881.) The updated version of Himem.sys accepts a /P switch that causes HIMEM to use this API to detect memory in excess of 16 MB. Without the /P switch, the updated HIMEM does not use this API, and functions the same as the shipping version of HIMEM. 

  The new version of Himem.sys reports itself as version 3.95, the same as the shipping version. Knowledge Base Reference Article: Q137755: No More Than 16 MB of Memory Reported on IBM PS/2 Model 77, 90 

Sorry, I can't find it on M$. email ME for a copy of HIMEMUPD.EXE
 
 
