US20120324456A1 - Managing nodes in a high-performance computing system using a node registrar - Google Patents

Managing nodes in a high-performance computing system using a node registrar

Info

Publication number
US20120324456A1
Authority
US
United States
Prior art keywords
node
subsystem
management
nodes
registrar
Prior art date
Legal status
Abandoned
Application number
US13/162,130
Inventor
Gregory Wray Teather
Current Assignee
Microsoft Technology Licensing LLC
Original Assignee
Microsoft Corp
Priority date
2011-06-16
Filing date
2011-06-16
Publication date
2012-12-20
Application filed by Microsoft Corp
Priority to US13/162,130
Assigned to MICROSOFT CORPORATION (Assignor: TEATHER, GREGORY WRAY)
Publication of US20120324456A1
Assigned to MICROSOFT TECHNOLOGY LICENSING, LLC (Assignor: MICROSOFT CORPORATION)
Priority claimed by continuation application US14/741,807 (US9747130B2)
Legal status: Abandoned

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 Multiprogramming arrangements
    • G06F 9/48 Program initiating; Program switching, e.g. by interrupt
    • G06F 9/4806 Task transfer initiation or dispatching
    • G06F 9/4843 Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 Multiprogramming arrangements
    • G06F 9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F 9/5061 Partitioning or combining of resources
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 2209/00 Indexing scheme relating to G06F9/00
    • G06F 2209/50 Indexing scheme relating to G06F9/50
    • G06F 2209/505 Clust
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 Multiprogramming arrangements
    • G06F 9/465 Distributed object oriented systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 Multiprogramming arrangements
    • G06F 9/48 Program initiating; Program switching, e.g. by interrupt
    • G06F 9/4806 Task transfer initiation or dispatching
    • G06F 9/4843 Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F 9/4881 Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 Multiprogramming arrangements
    • G06F 9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F 9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F 9/5011 Allocation of resources, e.g. of the central processing unit [CPU] to service a request, the resources being hardware resources other than CPUs, Servers and Terminals
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 67/00 Network arrangements or protocols for supporting network services or applications
    • H04L 67/01 Protocols
    • H04L 67/10 Protocols in which an application is distributed across nodes in the network
    • H04L 67/1001 Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers

Abstract

A method of managing nodes in a high-performance computing (HPC) system, which includes a management subsystem and a job scheduler subsystem, includes providing a node registrar subsystem. Logical node management functions are performed with the node registrar subsystem. Other management functions are performed with the management subsystem using the node registrar subsystem. Job scheduling functions are performed with the job scheduler subsystem using the node registrar subsystem.

Description

    BACKGROUND
  • High-performance computing (HPC) or cluster computing is increasingly used for a large number of computationally intense tasks, such as webscale data mining, machine learning, network traffic analysis, and various engineering and scientific tasks. In such systems, jobs may be scheduled to execute concurrently on a computing cluster in which application data is stored on multiple compute nodes.
  • Previous implementations of HPC clusters have maintained multiple node databases, between management and scheduler subsystems (with one-to-one mapping between the node-entries in each subsystem). This can lead to several problems, including the following: (1) Interaction between subsystems is informal and fragile; (2) scalability of a cluster is limited to the least scalable subsystem (for example, a system management subsystem may struggle if there are more than 1000 nodes); and (3) different types of HPC nodes may require different types of management and scheduling solutions.
  • SUMMARY
  • This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
  • A node registrar subsystem is disclosed that, according to one embodiment, is implemented as a service and a database, and acts as a central repository for information about all nodes within an HPC system. The node registrar subsystem formalizes data sharing between the HPC subsystems, and allows interaction with heterogeneous subsystems: different types of management, job scheduler, and monitoring solutions. The node registrar subsystem also facilitates scale-out of both management infrastructure and the job scheduler by delegating responsibility of different nodes to different sub-system instances.
  • One embodiment is directed to a method of managing nodes in a high-performance computing (HPC) system, which includes a management subsystem and a job scheduler subsystem. The method includes providing a node registrar subsystem. Logical node management functions are performed with the node registrar subsystem. Other management functions are performed with the management subsystem using the node registrar subsystem. Job scheduling functions are performed with the job scheduler subsystem using the node registrar subsystem.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • The accompanying drawings are included to provide a further understanding of embodiments and are incorporated in and constitute a part of this specification. The drawings illustrate embodiments and together with the description serve to explain principles of embodiments. Other embodiments and many of the intended advantages of embodiments will be readily appreciated, as they become better understood by reference to the following detailed description. The elements of the drawings are not necessarily to scale relative to each other. Like reference numerals designate corresponding similar parts.
  • FIG. 1 is a block diagram illustrating a high-performance computing (HPC) system suitable for implementing embodiments described herein.
  • FIG. 2 is a block diagram illustrating a computing device suitable for implementing aspects of the high-performance computing system shown in FIG. 1 according to one embodiment.
  • FIG. 3 is a diagram illustrating the interaction between subsystems of the high-performance computing system shown in FIG. 1 according to one embodiment.
  • FIG. 4 is a diagram illustrating a process interaction for a head node in the high-performance computing system shown in FIG. 1 according to one embodiment.
  • FIG. 5 is a diagram illustrating the internal architecture of a node registrar according to one embodiment.
  • FIG. 6 is a flow diagram illustrating a method of managing nodes in a high-performance computing system according to one embodiment.
  • DETAILED DESCRIPTION
  • In the following Detailed Description, reference is made to the accompanying drawings, which form a part hereof, and in which is shown by way of illustration specific embodiments in which the invention may be practiced. It is to be understood that other embodiments may be utilized and structural or logical changes may be made without departing from the scope of the present invention. The following detailed description, therefore, is not to be taken in a limiting sense, and the scope of the present invention is defined by the appended claims.
  • It is to be understood that features of the various exemplary embodiments described herein may be combined with each other, unless specifically noted otherwise.
  • The following detailed description is directed to technologies for implementing a node registrar as a central repository for information about all nodes in a high-performance computing (HPC) system. While the subject matter described herein is presented in the general context of program modules that execute in conjunction with the execution of an operating system and application programs on a computer system, those skilled in the art will recognize that other implementations may be performed in combination with other types of program modules. Generally, program modules include routines, programs, components, data structures, and other types of structures that perform particular tasks or implement particular abstract data types. Moreover, those skilled in the art will appreciate that the subject matter described herein may be practiced with other computer system configurations, including hand-held devices, multiprocessor systems, microprocessor-based or programmable consumer electronics, minicomputers, mainframe computers, and the like.
  • FIG. 1 is a block diagram illustrating an HPC system suitable for implementing embodiments described herein. The system 100 includes a client computer 102 capable of connecting to an HPC system through a network 104. The client computer 102 comprises, for example, a desktop, laptop, or mobile computing system. The system 100 also includes an HPC system, such as the computing cluster 106. An HPC system according to one embodiment is any type of computing system that offers computational performance at least an order of magnitude greater than a desktop computing system. For instance, HPC systems may include, but are not limited to, computing clusters, such as the computing cluster 106, mainframe computing systems, supercomputers, or other types of high-performance grid computing systems.
  • In the embodiments presented herein, the HPC system utilized by the client computer 102 comprises the computing cluster 106. The computing cluster 106 includes a head node 108 and one or more compute nodes 110A-110N (collectively referred to as nodes or compute nodes 110). The head node 108 comprises a computing system responsible for performing tasks such as job management, cluster management, scheduling of tasks, and resource management for all of the compute nodes 110A-110N in the computing cluster 106. The compute nodes 110A-110N are computing systems that perform the actual computations. The computing cluster 106 may have virtually any number of compute nodes 110A-110N. A node or a compute node according to one embodiment is an individually identifiable computer within an HPC system.
  • It should be appreciated that the network 104 may comprise any type of local area network or wide area network suitable for connecting the client computer 102 and the computing cluster 106. For instance, in one embodiment, the network 104 comprises a high-speed local area network suitable for connecting the client computer 102 and the computing cluster 106. In other embodiments, however, the network 104 may comprise a high-speed wide area network, such as the Internet, for connecting the client computer 102 and the computing cluster 106 over a greater geographical area. It should also be appreciated that the computing cluster 106 may also utilize various high-speed interconnects between the head node 108 and each of the compute nodes 110A-110N.
  • FIG. 2 is a block diagram illustrating a computing device 200 suitable for implementing aspects of the high-performance computing system shown in FIG. 1 according to one embodiment. For example, computing device 200 may be used for one or more of client computer 102, head node 108, and compute nodes 110A-110N. In the illustrated embodiment, the computing device 200 includes one or more processing units 212 and system memory 214. Depending on the exact configuration and type of computing device, memory 214 may be volatile (such as RAM), non-volatile (such as ROM, flash memory, etc.), or some combination of the two.
  • Computing device 200 may also have additional features/functionality. For example, computing device 200 may also include additional storage (removable and/or non-removable) including, but not limited to, magnetic or optical disks or tape. Such additional storage is illustrated in FIG. 2 by removable storage 216 and non-removable storage 218. Computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any suitable method or technology for storage of information such as computer readable instructions, data structures, program modules or other data. Memory 214, removable storage 216 and non-removable storage 218 are all examples of computer storage media (e.g., computer-readable storage media storing computer-executable instructions that when executed by at least one processor cause the at least one processor to perform a method). Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and that can be accessed by computing device 200. Any such computer storage media may be part of computing device 200.
  • The various elements of computing device 200 are communicatively coupled together via one or more communication links 215. Computing device 200 also includes one or more communication connections 224 that allow computing device 200 to communicate with other computers/applications 226. Computing device 200 may also include input device(s) 222, such as keyboard, pointing device (e.g., mouse), pen, voice input device, touch input device, etc. Computing device 200 may also include output device(s) 220, such as a display, speakers, printer, etc.
  • FIGS. 1 and 2 and the above discussion are intended to provide a brief general description of a suitable computing environment in which one or more embodiments may be implemented, and are not intended to suggest any limitation as to the scope of use or functionality of the embodiments.
  • FIG. 3 is a diagram illustrating the interaction between subsystems of the HPC system 100 shown in FIG. 1 according to one embodiment. As shown in FIG. 3, HPC system 100 includes a node registrar subsystem 302, a management subsystem 304, and a job scheduler subsystem 306. In one embodiment, subsystems 302, 304, and 306 are implemented on head node 108 (FIG. 1). Management subsystem 304 communicates with node registrar subsystem 302 to create nodes and update node properties as indicated by link 303. Scheduler subsystem 306 communicates with node registrar subsystem 302 to update node properties, as indicated by link 307. Management subsystem 304 and scheduler subsystem 306 further communicate with node registrar subsystem 302 to enumerate groups of nodes, query nodes by property, query nodes by group, and get node properties, as indicated by link 305.
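  • The FIG. 3 interactions can be sketched in code. The following Python class is a hypothetical, in-memory stand-in for the node registrar's client-facing operations (create nodes, update node properties, enumerate groups, query nodes by property or by group, and get node properties); the method names and storage are illustrative assumptions, not the patent's actual interface.

```python
# Minimal sketch (not the actual node registrar API): an in-memory stand-in for
# the operations used by the management and job scheduler subsystems in FIG. 3.
from typing import Dict, List, Set


class NodeRegistrar:
    def __init__(self) -> None:
        self._nodes: Dict[str, Dict[str, str]] = {}   # node name -> properties
        self._groups: Dict[str, Set[str]] = {}        # group name -> node names

    # Link 303: the management subsystem creates nodes and updates properties.
    def create_node(self, name: str, **properties: str) -> None:
        self._nodes[name] = dict(properties)

    # Links 303/307: management and scheduler subsystems update node properties.
    def update_node_properties(self, name: str, **properties: str) -> None:
        self._nodes[name].update(properties)

    # Link 305: shared read operations used by both subsystems.
    def get_node_properties(self, name: str) -> Dict[str, str]:
        return dict(self._nodes[name])

    def enumerate_groups(self) -> List[str]:
        return sorted(self._groups)

    def query_nodes_by_property(self, key: str, value: str) -> List[str]:
        return [n for n, p in self._nodes.items() if p.get(key) == value]

    def query_nodes_by_group(self, group: str) -> List[str]:
        return sorted(self._groups.get(group, set()))

    def add_node_to_group(self, group: str, name: str) -> None:
        self._groups.setdefault(group, set()).add(name)
```

  • In this sketch the management subsystem would call create_node and update_node_properties, while the job scheduler would mostly use the query and get operations before making scheduling decisions.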
  • Node registrar subsystem 302 according to one embodiment performs some management functions. In one embodiment, node registrar subsystem 302 performs logical node management (e.g., adding nodes, removing nodes, grouping nodes, and handling state transitions of nodes). Management subsystem 304 according to one embodiment handles: (1) Node deployment (e.g., getting an operating system and HPC Pack running on an actual node); (2) node configuration management (e.g., altering system configuration of a node after initial installation, and then on an ongoing basis); (3) infrastructure configuration management (e.g., altering configuration of network services after cluster setup, and then on an ongoing basis); and (4) node monitoring (e.g., live heat-map and performance charts).
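  • One of the logical node management duties listed above is handling state transitions. The sketch below illustrates the idea with a simple transition table; the state names and allowed transitions are assumptions made for illustration and are not specified in the patent.

```python
# Illustrative sketch of logical node state handling; the state names and the
# allowed transitions below are assumptions, not taken from the patent text.
ALLOWED_TRANSITIONS = {
    "Unknown":      {"Provisioning"},
    "Provisioning": {"Offline"},
    "Offline":      {"Online", "Removed"},
    "Online":       {"Draining", "Offline"},
    "Draining":     {"Offline"},
}


def transition_node(current_state: str, new_state: str) -> str:
    """Validate a logical state change before it is written to the registrar store."""
    if new_state not in ALLOWED_TRANSITIONS.get(current_state, set()):
        raise ValueError(f"illegal node state transition: {current_state} -> {new_state}")
    return new_state
```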
  • Node registrar subsystem 302 according to one embodiment is implemented as a service and a database, and acts as a central repository for information about all nodes within the HPC system 100 (including, for example, head nodes, compute nodes, broker nodes, workstation nodes, Azure worker nodes, and Azure virtual machine nodes). The node registrar subsystem 302 formalizes data sharing between the HPC subsystems (e.g., between subsystems 302, 304, and 306), and allows interaction with heterogeneous subsystems: different types of management, job scheduler, and monitoring solutions. The node registrar subsystem 302 also facilitates scale-out of both management infrastructure and the job scheduler by delegating responsibility of different nodes to different sub-system instances, and allows different types of management and job scheduler implementations to run side-by-side.
  • The node registrar subsystem 302 according to one embodiment maintains information that has common relevance across all HPC node types. In one embodiment, this includes node identifiers (such as name and SID), as well as HPC-logical information (such as type, state, and group membership). The node registrar subsystem 302 additionally maintains resource information about the nodes (e.g., information that job scheduler subsystem 306 uses to make scheduling decisions).
  • Practically, the node registrar subsystem 302 according to one embodiment efficiently drives the node list (both from a graphical user interface (GUI) and PowerShell) and acts as an authoritative list of nodes for other components within the HPC system 100. In one embodiment, node registrar subsystem 302 also performs workflows associated with logical changes to the HPC node data, such as adding and removing nodes, updating common node properties, and changing node state.
  • Additional features and advantages of the node registrar subsystem 302 according to one embodiment include the following: (1) The node registrar interfaces are versioned; (2) treatment of shared data between the HPC management 304 and job scheduler 306 components is streamlined through the node registrar 302; (3) HPC management 304 and job scheduler 306 components are explicitly dependent on the node registrar 302 (and not each other); (4) the node registrar 302 supports nodes running with no management component; (5) the node registrar service is stateless and can scale-out to meet high availability requirements; (6) the node registrar 302 is integrated with a granular permissions system; (7) the node registrar 302 supports multiple authentication modes; (8) the node registrar 302 can run in Azure, using a SQL Azure store; and (9) the node registrar 302 supports client concurrency, executing both read and write operations against the store.
  • FIG. 4 is a diagram illustrating a process interaction for head node 108 in the HPC system 100 shown in FIG. 1 according to one embodiment. HpcSdm service 404 and HpcManagement service 406 correspond to management subsystem 304 (FIG. 3), and provide configuration management of the head node 108 (FIG. 1), as well as manage deployment of compute nodes 110A-110N. HpcNodeRegistrar service 408 corresponds to node registrar subsystem 302 (FIG. 3), and maintains a mapping between nodes and their management and scheduler owners, which facilitates heterogeneous node management solutions, as well as head-node scale-out. HpcScheduler service 410 corresponds to scheduler subsystem 306 (FIG. 3), and schedules jobs to be performed by compute nodes 110A-110N. In one embodiment, there can be more than one HpcScheduler 410 per HpcNodeRegistrar 408, and likewise there can be more than one management component per HpcNodeRegistrar 408. A relational database server 402 (which is a SQL server in the illustrated embodiment) stores a node registrar database 403. The node registrar database 403 also corresponds to the node registrar subsystem 302 shown in FIG. 3.
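  • The mapping between nodes and their management and scheduler owners can be pictured with a small sketch. The structure below is a hypothetical illustration of that idea (the URIs, field names, and node names are invented for the example); it shows how a given scheduler or management instance could scope its view to only the nodes it owns, which is what allows several such instances per HpcNodeRegistrar.

```python
# Hypothetical sketch of the node-to-owner mapping: each node records which
# management component and which scheduler own it, so a subsystem instance can
# scope queries to the nodes it is responsible for.  All names are illustrative.
from typing import Dict, List, NamedTuple


class NodeOwners(NamedTuple):
    management_uri: str   # URI of the owning management component (assumed)
    scheduler_uri: str    # URI of the owning scheduler component (assumed)


owners: Dict[str, NodeOwners] = {
    "node001": NodeOwners("net.tcp://head1/management", "net.tcp://head1/scheduler"),
    "node002": NodeOwners("net.tcp://head2/management", "net.tcp://head1/scheduler"),
}


def nodes_owned_by_scheduler(scheduler_uri: str) -> List[str]:
    """Return the nodes that a particular scheduler instance should see."""
    return [name for name, o in owners.items() if o.scheduler_uri == scheduler_uri]
```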
  • The node registrar subsystem 302 (FIG. 3) according to one embodiment includes a stateless HpcNodeRegistrar service 408 as well as a database 403 for storing node state information. In one embodiment, an HPC system may include multiple instances of the HpcNodeRegistrar service 408 running on multiple nodes, and all of the instances access the database 403 to manage state information. In one aspect of this embodiment, each head node 108 (FIG. 1) in each cluster of compute nodes 110A-110N of a given HPC system runs a copy of the HpcNodeRegistrar service 408. Clients of the node registrar subsystem 302, such as client computer 102 (FIG. 1), have a list of all of the head nodes 108, and make round-robin connection attempts to the head nodes 108 to access the service 408. In one embodiment, management subsystem 304 and scheduler subsystem 306 (FIG. 3) are also clients of the node registrar subsystem 302. For example, scheduler subsystem 306 may access HpcNodeRegistrar service 408 to determine the state of a particular node and determine based on that state whether to schedule work on that node.
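  • A minimal sketch of the round-robin connection behavior described above, under the assumption of a plain TCP reachability probe (the actual service is a WCF endpoint, and the host names and port number here are invented for illustration):

```python
import itertools
import socket

# Clients hold a list of head nodes and try them in round-robin order until one
# accepts a connection.  Host names and the port are illustrative assumptions.
HEAD_NODES = ["head1.cluster.local", "head2.cluster.local", "head3.cluster.local"]


def connect_to_node_registrar(head_nodes=HEAD_NODES, port=9894, rounds=2):
    """Return the first successful connection to a node registrar instance."""
    for host in itertools.islice(itertools.cycle(head_nodes), len(head_nodes) * rounds):
        try:
            return socket.create_connection((host, port), timeout=5)
        except OSError:
            continue  # this head node is unreachable; try the next one
    raise ConnectionError("no node registrar instance is reachable")
```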
  • As shown in FIG. 4, services 404, 406, 408, and 410 communicate with each other, as well as with server 402 and client 412, as represented by links 405, 407, 409, 411, 413, 415, 417, 419, and 421. Specifically, client 412 provides rich node information and deployment operations to HpcSdm service 404, as indicated by link 417. Client 412 provides basic node information, logical node operations, and node group operations to HpcNodeRegistrar 408, as indicated by link 419. Client 412 provides job information and operations to HpcScheduler 410, as indicated by link 421.
  • FIG. 5 is a diagram illustrating the internal architecture of a node registrar service 408 according to one embodiment. Service 408 according to one embodiment is entirely stateless, and handles high-availability through scale-out of multiple services (active-active) rather than relying on failover. Service 408 includes operational logging unit 508, tracing unit 510, and permission manager unit 512. Application programming interface (API) 506 acts as one large monolithic interface presented to all outside components over a single Windows Communication Foundation (WCF) channel 507. In one embodiment, the same interface 506 applies whether the caller is a user interface, arbitrary user code, or an HPC service. In other embodiments, the API 506 can be carved into public and private components as necessary. The API 506 is exposed as a WCF endpoint by each instance of the node registrar service 408, and provides all of the external functionality of the node registrar. The permission manager unit 512 performs authentication and permission-validation for the diverse set of callers. The tracing unit 510 performs eventing and tracing functions (e.g., using an event trace log (ETL)). Operational logging unit 508 logs user operations to the database 403 based on information received through a .NET Remoting link 505. The data access layer (DAL) 502 is a software system component that directly interacts with the server 402, as indicated by link 503.
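  • The flow of a single call through the units in FIG. 5 can be sketched roughly as follows. This is a simplified, hypothetical pipeline (SQLite stands in for the SQL Server store, and the function and column names are assumptions): the permission manager validates the caller, the data access layer runs the query on its own connection, and the operation is then logged.

```python
import logging
import sqlite3

log = logging.getLogger("node_registrar.operations")


def check_permission(caller: str, operation: str) -> None:
    # Stand-in for the permission manager unit (512); real validation would use
    # the granular permissions system mentioned in the text.
    if caller == "anonymous":
        raise PermissionError(f"{caller} may not perform {operation}")


def get_node_state(db_path: str, caller: str, node_name: str) -> str:
    check_permission(caller, "GetNodeState")           # permission manager unit 512
    with sqlite3.connect(db_path) as conn:             # data access layer 502: one
        row = conn.execute(                            # connection per call, no
            "SELECT State FROM Node WHERE Name = ?",   # locking in the DAL itself
            (node_name,),
        ).fetchone()
    log.info("caller=%s op=GetNodeState node=%s", caller, node_name)  # operational logging
    return row[0] if row else "Unknown"
```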
  • Multiple instances of the node registrar service 408 can run in active-active configuration against the same database 403 to facilitate high availability. Additionally, each individual node registrar service 408 runs with multiple threads in one embodiment, and there is no locking in the DAL 502 to prevent simultaneous requests to the database 403.
  • SQL server 402 stores node registrar database 403, which includes a plurality of tables. The tables in database 403 according to one embodiment include a Node table, a NodeProperty table, a NetworkInterface table, a Service table, a NodeGroup table, a GroupMembership table, and a GlobalSettings table. These tables are described in further detail below, followed by an illustrative schema sketch.
  • The Node table is the central table of the node registrar 302. In one embodiment, each row in the Node table corresponds to a node in the HPC installation. Node properties that are columns in this table are first-class properties that may be used in filters. All nodes are versioned in one embodiment, so that if semantic changes are made to a node type, the system has the flexibility to exclude that type in future versions.
  • The NodeProperty table contains arbitrary id/value pairs associated with particular nodes. These values represent second-class node properties. The id column is indexed for reasonably fast lookups. If a node is deleted, the associated properties are cascade deleted.
  • The NetworkInterface table stores network interface information for nodes. Each node can have multiple NICs with different MAC addresses.
  • The Service table contains management and job scheduler components associated with this node registrar. This data serves a few purposes: (1) When a management or scheduler component calls into the node registrar 302, its view of the nodes can be easily scoped to nodes it cares about; (2) the GUI can query the Service table for a list of operation log providers; (3) management and scheduler URIs are associated with each node, allowing the client to find the proper component for data and scenarios that exist outside the node registrar scope.
  • The NodeGroup table contains a list of HPC Node Groups.
  • The GroupMembership table provides group membership information for nodes. Each row in this table defines the relationship of a specific node to a specific group. If either the node or node group are deleted, the group membership is cascade deleted.
  • The GlobalSettings table stores various configuration properties that are common across all active node registrars.
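  • The tables described above can be consolidated into an illustrative schema. The sketch below uses SQLite DDL as a stand-in for the SQL Server store; any column that is not explicitly named in the text (for example the exact key columns) is an assumption made for the example.

```python
import sqlite3

# Illustrative schema for the node registrar tables described above.  SQLite
# stands in for SQL Server; columns not named in the text are assumptions.
SCHEMA = """
PRAGMA foreign_keys = ON;

CREATE TABLE Node (                  -- one row per node in the HPC installation
    NodeId    INTEGER PRIMARY KEY,
    Name      TEXT NOT NULL UNIQUE,
    Sid       TEXT,
    NodeType  TEXT,                  -- head, compute, broker, workstation, ...
    State     TEXT,
    Version   INTEGER                -- nodes are versioned
);

CREATE TABLE NodeProperty (          -- second-class id/value pairs per node
    NodeId      INTEGER NOT NULL REFERENCES Node(NodeId) ON DELETE CASCADE,
    PropertyId  INTEGER NOT NULL,
    Value       TEXT
);
CREATE INDEX IX_NodeProperty_Id ON NodeProperty (PropertyId);

CREATE TABLE NetworkInterface (      -- a node can have multiple NICs
    NodeId      INTEGER NOT NULL REFERENCES Node(NodeId) ON DELETE CASCADE,
    MacAddress  TEXT NOT NULL
);

CREATE TABLE Service (               -- management / scheduler components and their URIs
    ServiceId  INTEGER PRIMARY KEY,
    Kind       TEXT,                 -- e.g. 'management' or 'scheduler'
    Uri        TEXT
);

CREATE TABLE NodeGroup (
    GroupId  INTEGER PRIMARY KEY,
    Name     TEXT NOT NULL UNIQUE
);

CREATE TABLE GroupMembership (       -- node <-> group; cascade deleted with either side
    NodeId   INTEGER NOT NULL REFERENCES Node(NodeId) ON DELETE CASCADE,
    GroupId  INTEGER NOT NULL REFERENCES NodeGroup(GroupId) ON DELETE CASCADE,
    PRIMARY KEY (NodeId, GroupId)
);

CREATE TABLE GlobalSettings (        -- settings shared by all active node registrars
    Name   TEXT PRIMARY KEY,
    Value  TEXT
);
"""

with sqlite3.connect(":memory:") as conn:
    conn.executescript(SCHEMA)
```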
  • FIG. 6 is a flow diagram illustrating a method 600 of managing nodes in a high-performance computing (HPC) system 100, which includes a management subsystem 304 and a job scheduler subsystem 306, according to one embodiment. At 602 in method 600, a node registrar subsystem 302 is provided. At 604, logical node management functions are performed with the node registrar subsystem. At 606, other management functions are performed with the management subsystem using the node registrar subsystem. At 608, job scheduling functions are performed with the job scheduler subsystem using the node registrar subsystem.
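  • As a rough illustration of how those four steps fit together, the snippet below reuses the hypothetical NodeRegistrar class sketched earlier (so it is an assumption-laden toy, not the patented implementation): the registrar is provided, logical node management happens through it, and the management and scheduler subsystems then act as its clients.

```python
# Sketch of method 600, reusing the hypothetical NodeRegistrar class from the
# earlier sketch; the step numbers refer to the flow diagram of FIG. 6.
def run_method_600() -> list:
    registrar = NodeRegistrar()                                    # 602: provide the registrar

    # 604: logical node management performed with the registrar.
    registrar.create_node("node001", state="Offline", cores="16")
    registrar.add_node_to_group("ComputeNodes", "node001")

    # 606: other management functions (deployment, configuration, monitoring) are
    # performed by the management subsystem, which reads and writes node data
    # through the registrar, e.g. marking the node online once it is deployed.
    registrar.update_node_properties("node001", state="Online")

    # 608: the job scheduler subsystem queries the registrar before placing work.
    candidates = registrar.query_nodes_by_group("ComputeNodes")
    return [n for n in candidates
            if registrar.get_node_properties(n).get("state") == "Online"]
```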
  • In one embodiment, the management subsystem 304 and the job scheduler subsystem 306 in method 600 are each a client of the node registrar subsystem 302. The node registrar subsystem 302 in method 600 according to one embodiment comprises a stateless node registrar service 408 and a database 403 for storing node information for the nodes in the HPC system 100. In one embodiment of method 600, the management subsystem 304 is configured to access the stored node information, update node properties in the database 403, and query the nodes by property and by group, using the node registrar service 408. In one embodiment of method 600, the job scheduler subsystem 306 is configured to access the stored node information, update node properties in the database 403, and query the nodes by property and by group, using the node registrar service 408. The database 403 of the node registrar subsystem 302 in method 600 according to one embodiment includes a node table, with each row in the node table corresponding to one of the nodes in the HPC system 100, and each column listing properties of the nodes in the HPC system 100. The logical node management functions performed by the node registrar subsystem 302 in method 600 according to one embodiment include adding nodes, removing nodes, updating node properties, handling state transitions of nodes, and grouping nodes. The other management functions performed with the management subsystem in method 600 according to one embodiment include node deployment, node configuration management, infrastructure configuration management, and node monitoring.
  • Although specific embodiments have been illustrated and described herein, it will be appreciated by those of ordinary skill in the art that a variety of alternate and/or equivalent implementations may be substituted for the specific embodiments shown and described without departing from the scope of the present invention. This application is intended to cover any adaptations or variations of the specific embodiments discussed herein. Therefore, it is intended that this invention be limited only by the claims and the equivalents thereof.

Claims (20)

1. A method of managing nodes in a high-performance computing (HPC) system, which includes a management subsystem and a job scheduler subsystem, the method comprising:
providing a node registrar subsystem;
performing logical node management functions with the node registrar subsystem;
performing other management functions with the management subsystem using the node registrar subsystem; and
performing job scheduling functions with the job scheduler subsystem using the node registrar subsystem.
2. The method of claim 1, wherein the management subsystem and the job scheduler subsystem are each a client of the node registrar subsystem.
3. The method of claim 1, wherein the node registrar subsystem comprises a stateless node registrar service and a database for storing node information for the nodes in the HPC system.
4. The method of claim 3, and further comprising:
accessing the stored node information with the management subsystem using the node registrar service.
5. The method of claim 3, and further comprising:
updating node properties in the database with the management subsystem using the node registrar service.
6. The method of claim 3, and further comprising:
querying the nodes by property with the management subsystem using the node registrar service.
7. The method of claim 3, and further comprising:
querying the nodes by group with the management subsystem using the node registrar service.
8. The method of claim 3, and further comprising:
accessing the stored node information with the job scheduler subsystem using the node registrar service.
9. The method of claim 3, and further comprising:
updating node properties in the database with the job scheduler subsystem using the node registrar service.
10. The method of claim 3, and further comprising:
querying the nodes by property with the job scheduler subsystem using the node registrar service.
11. The method of claim 3, and further comprising:
querying the nodes by group with the job scheduler subsystem using the node registrar service.
12. The method of claim 3, wherein the database includes a node table with each row in the node table corresponding to one of the nodes in the HPC system, and each column listing properties of the nodes in the HPC system.
13. The method of claim 1, wherein the logical node management functions performed by the node registrar subsystem include adding nodes and removing nodes.
14. The method of claim 13, wherein the logical node management functions performed by the node registrar subsystem further include updating node properties and handling state transitions of nodes.
15. The method of claim 14, wherein the logical node management functions performed by the node registrar subsystem further include grouping nodes.
16. The method of claim 13, and further comprising:
performing node deployment, node configuration management, infrastructure configuration management, and node monitoring with the management subsystem.
17. A computer-readable storage medium storing computer-executable instructions that when executed by at least one processor cause the at least one processor to perform a method of managing nodes in a high-performance computing (HPC) system, wherein the HPC system includes a management subsystem and a job scheduler subsystem, the method comprising:
performing logical node management functions with a node registrar subsystem;
performing other management functions with the management subsystem using the node registrar subsystem; and
performing job scheduling functions with the job scheduler subsystem using the node registrar subsystem.
18. The computer-readable storage medium of claim 17, wherein the node registrar subsystem comprises a stateless node registrar service and a database for storing node information for the nodes in the HPC system, and wherein the management subsystem and job scheduler subsystem are configured to access the database using the node registrar service.
19. The computer-readable storage medium of claim 17, wherein the logical node management functions performed by the node registrar subsystem include adding nodes, removing nodes, updating node properties, and handling state transitions of nodes, and wherein the other management functions performed by the management subsystem include node deployment, node configuration management, infrastructure configuration management, and node monitoring.
20. A method of managing nodes in a high-performance computing (HPC) system, which includes a management subsystem and a job scheduler subsystem, the method comprising:
providing a node registrar subsystem, wherein the node registrar subsystem comprises a stateless node registrar service and a database for storing node information for the nodes in the HPC system;
performing logical node management functions with the node registrar subsystem;
performing other management functions with the management subsystem using the node registrar service to access the database; and
performing job scheduling functions with the job scheduler subsystem using the node registrar service to access the database.
US13/162,130 2011-06-16 2011-06-16 Managing nodes in a high-performance computing system using a node registrar Abandoned US20120324456A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
US13/162,130 US20120324456A1 (en) 2011-06-16 2011-06-16 Managing nodes in a high-performance computing system using a node registrar
US14/741,807 US9747130B2 (en) 2011-06-16 2015-06-17 Managing nodes in a high-performance computing system using a node registrar

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US13/162,130 US20120324456A1 (en) 2011-06-16 2011-06-16 Managing nodes in a high-performance computing system using a node registrar

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US14/741,807 Continuation US9747130B2 (en) 2011-06-16 2015-06-17 Managing nodes in a high-performance computing system using a node registrar

Publications (1)

Publication Number Publication Date
US20120324456A1 true US20120324456A1 (en) 2012-12-20

Family

ID=47354825

Family Applications (2)

Application Number Title Priority Date Filing Date
US13/162,130 Abandoned US20120324456A1 (en) 2011-06-16 2011-06-16 Managing nodes in a high-performance computing system using a node registrar
US14/741,807 Active US9747130B2 (en) 2011-06-16 2015-06-17 Managing nodes in a high-performance computing system using a node registrar

Family Applications After (1)

Application Number Title Priority Date Filing Date
US14/741,807 Active US9747130B2 (en) 2011-06-16 2015-06-17 Managing nodes in a high-performance computing system using a node registrar

Country Status (1)

Country Link
US (2) US20120324456A1 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11809451B2 (en) * 2014-02-19 2023-11-07 Snowflake Inc. Caching systems and methods
CN108037930A (en) * 2017-12-25 2018-05-15 郑州云海信息技术有限公司 Deployment method, device and equipment for a Lustre file system
CN108536528A (en) * 2018-03-23 2018-09-14 湖南大学 Application-aware large-scale network job scheduling method

Family Cites Families (26)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5964837A (en) * 1995-06-28 1999-10-12 International Business Machines Corporation Computer network management using dynamic switching between event-driven and polling type of monitoring from manager station
US6385643B1 (en) * 1998-11-05 2002-05-07 Bea Systems, Inc. Clustered enterprise Java™ having a message passing kernel in a distributed processing system
WO2000060472A1 (en) * 1999-04-06 2000-10-12 Lipstream Networks, Inc. Facilitating real-time, multi-point communications over the internet
JP2001144759A (en) * 1999-11-12 2001-05-25 Fujitsu Ltd Communication network management system, sub communication network managing device used for the communication network management system, communication network managing device and computer readable recording medium with program recorded thereon
US7376693B2 (en) * 2002-02-08 2008-05-20 Jp Morgan Chase & Company System architecture for distributed computing and method of using the system
US7882504B2 (en) * 2004-01-29 2011-02-01 Klingman Edwin E Intelligent memory device with wakeup feature
US20050235055A1 (en) 2004-04-15 2005-10-20 Raytheon Company Graphical user interface for managing HPC clusters
US7711977B2 (en) 2004-04-15 2010-05-04 Raytheon Company System and method for detecting and managing HPC node failure
US7861246B2 (en) * 2004-06-17 2010-12-28 Platform Computing Corporation Job-centric scheduling in a grid environment
US7433931B2 (en) 2004-11-17 2008-10-07 Raytheon Company Scheduling in a high-performance computing (HPC) system
US20060198386A1 (en) * 2005-03-01 2006-09-07 Tong Liu System and method for distributed information handling system cluster active-active master node
US7590653B2 (en) * 2005-03-02 2009-09-15 Cassatt Corporation Automated discovery and inventory of nodes within an autonomic distributed computing system
US7853948B2 (en) * 2005-10-24 2010-12-14 International Business Machines Corporation Method and apparatus for scheduling grid jobs
US20080320482A1 (en) * 2007-06-20 2008-12-25 Dawson Christopher J Management of grid computing resources based on service level requirements
US8230070B2 (en) * 2007-11-09 2012-07-24 Manjrasoft Pty. Ltd. System and method for grid and cloud computing
US8495557B2 (en) * 2008-04-03 2013-07-23 Microsoft Corporation Highly available large scale network and internet systems
US8108466B2 (en) 2008-05-01 2012-01-31 Microsoft Corporation Automated offloading of user-defined functions to a high performance computing system
KR101495806B1 (en) * 2008-12-24 2015-02-26 삼성전자주식회사 Non-volatile memory device
US9600344B2 (en) 2009-01-21 2017-03-21 International Business Machines Corporation Proportional resizing of a logical partition based on a degree of performance difference between threads for high-performance computing on non-dedicated clusters
US8914511B1 (en) * 2009-06-26 2014-12-16 VMTurbo, Inc. Managing resources in virtualization systems
US8250213B2 (en) * 2009-11-16 2012-08-21 At&T Intellectual Property I, L.P. Methods and apparatus to allocate resources associated with a distributive computing network
US8615584B2 (en) * 2009-12-03 2013-12-24 International Business Machines Corporation Reserving services within a cloud computing environment
US8510749B2 (en) * 2010-05-27 2013-08-13 International Business Machines Corporation Framework for scheduling multicore processors
US8914805B2 (en) * 2010-08-31 2014-12-16 International Business Machines Corporation Rescheduling workload in a hybrid computing environment
US9658901B2 (en) * 2010-11-12 2017-05-23 Oracle International Corporation Event-based orchestration in distributed order orchestration system
US8453152B2 (en) * 2011-02-01 2013-05-28 International Business Machines Corporation Workflow control of reservations and regular jobs using a flexible job scheduler

Patent Citations (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5666486A (en) * 1995-06-23 1997-09-09 Data General Corporation Multiprocessor cluster membership manager framework
US5793962A (en) * 1996-04-30 1998-08-11 International Business Machines Corporation System for managing membership of a group of processors in a distributed computing environment
US7366989B2 (en) * 1999-05-26 2008-04-29 Fujitsu Limited Element management system with data-driven interfacing driven by instantiation of meta-model
US7185075B1 (en) * 1999-05-26 2007-02-27 Fujitsu Limited Element management system with dynamic database updates based on parsed snooping
US6578068B1 (en) * 1999-08-31 2003-06-10 Accenture Llp Load balancer in environment services patterns
US20110145383A1 (en) * 2000-02-28 2011-06-16 Bishop David A Enterprise management system
US6691244B1 (en) * 2000-03-14 2004-02-10 Sun Microsystems, Inc. System and method for comprehensive availability management in a high-availability computer system
US7185076B1 (en) * 2000-05-31 2007-02-27 International Business Machines Corporation Method, system and program products for managing a clustered computing environment
US7188343B2 (en) * 2001-05-18 2007-03-06 Hewlett-Packard Development Company, L.P. Distributable multi-daemon configuration for multi-system management
US7266822B1 (en) * 2002-08-14 2007-09-04 Sun Microsystems, Inc. System and method for controlling and managing computer farms
US6928589B1 (en) * 2004-01-23 2005-08-09 Hewlett-Packard Development Company, L.P. Node management in high-availability cluster
US20050251567A1 (en) * 2004-04-15 2005-11-10 Raytheon Company System and method for cluster management based on HPC architecture
US8336040B2 (en) * 2004-04-15 2012-12-18 Raytheon Company System and method for topology-aware job scheduling and backfilling in an HPC environment
US20060064486A1 (en) * 2004-09-17 2006-03-23 Microsoft Corporation Methods for service monitoring and control
US8724463B2 (en) * 2006-12-07 2014-05-13 Cisco Technology, Inc. Scalability of providing packet flow management
US20080307426A1 (en) * 2007-06-05 2008-12-11 Telefonaktiebolaget Lm Ericsson (Publ) Dynamic load management in high availability systems
US20090254917A1 (en) * 2008-04-02 2009-10-08 Atsuhisa Ohtani System and method for improved i/o node control in computer system
US20110125894A1 (en) * 2009-11-25 2011-05-26 Novell, Inc. System and method for intelligent workload management
US20120042256A1 (en) * 2010-08-13 2012-02-16 International Business Machines Corporation High performance computing as a service

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130304849A1 (en) * 2012-05-14 2013-11-14 Sap Ag Distribution of messages in system landscapes
US9037678B2 (en) * 2012-05-14 2015-05-19 Sap Se Distribution of messages in system landscapes
US9645873B2 (en) 2013-06-03 2017-05-09 Red Hat, Inc. Integrated configuration management and monitoring for computer systems
CN106713024A (en) * 2016-12-14 2017-05-24 郑州云海信息技术有限公司 Method and system for batch management of cluster nodes, and computer cluster management node
CN110737489A (en) * 2019-10-08 2020-01-31 成都中讯创新科技股份有限公司 Intelligent high-performance computing center
US20210349909A1 (en) * 2020-05-08 2021-11-11 Worthy Technology LLC System and methods for creating, distributing, analyzing and optimizing data-driven signals
US11586644B2 (en) * 2020-05-08 2023-02-21 Worthy Technology LLC System and methods for creating, distributing, analyzing and optimizing data-driven signals
CN113296972A (en) * 2020-07-20 2021-08-24 阿里巴巴集团控股有限公司 Information registration method, computing device and storage medium

Also Published As

Publication number Publication date
US20160004563A1 (en) 2016-01-07
US9747130B2 (en) 2017-08-29

Similar Documents

Publication Publication Date Title
US9747130B2 (en) Managing nodes in a high-performance computing system using a node registrar
US11461329B2 (en) Tracking query execution status for selectively routing queries
US10992740B2 (en) Dynamically balancing partitions within a distributed streaming storage platform
US10275281B2 (en) Scheduling jobs for processing log files using a database system
US9767040B2 (en) System and method for generating and storing real-time analytics metric data using an in memory buffer service consumer framework
US9576000B2 (en) Adaptive fragment assignment for processing file data in a database
US11271995B2 (en) Partition balancing in an on-demand services environment
US9449188B2 (en) Integration user for analytical access to read only data stores generated from transactional systems
US10725829B2 (en) Scheduling capacity in a data-processing cluster to an application queue by repurposing monitoring-based capacity of a delegator queue for job execution in the application queue
US20150142844A1 (en) Scalable objects for use in an on-demand services environment
US20130298202A1 (en) Computer implemented methods and apparatus for providing permissions to users in an on-demand service environment
US11314770B2 (en) Database multiplexing architectures
US11614967B2 (en) Distributed scheduling in a virtual machine environment
US11120049B2 (en) Concurrent data imports
US10944814B1 (en) Independent resource scheduling for distributed data processing programs
US11886551B2 (en) Systems and methods for asset management
US11755394B2 (en) Systems, methods, and apparatuses for tenant migration between instances in a cloud based computing environment
US20200233870A1 (en) Systems and methods for linking metric data to resources
US20180095664A1 (en) Techniques and architectures for efficient allocation of under-utilized resources
US20240106828A1 (en) System and method for implementing a cloud agnostic data mesh module
CN117312103B Hot-pluggable data scheduling and processing system for distributed heterogeneous data sources
US11061734B2 (en) Performing customized data compaction for efficient parallel data processing amongst a set of computing resources

Legal Events

Date Code Title Description
AS Assignment

Owner name: MICROSOFT CORPORATION, WASHINGTON

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:TEATHER, GREGORY WRAY;REEL/FRAME:026468/0761

Effective date: 20110614

AS Assignment

Owner name: MICROSOFT TECHNOLOGY LICENSING, LLC, WASHINGTON

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:MICROSOFT CORPORATION;REEL/FRAME:034544/0001

Effective date: 20141014

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO PAY ISSUE FEE