[Pkg-ofed-commits] [infinipath-psm] 01/11: Reduce virtual memory usage in non-SCIF builds.
Ana Beatriz Guerrero López
ana at moszumanska.debian.org
Sun Apr 3 20:04:10 UTC 2016
This is an automated email from the git hooks/post-receive script.
ana pushed a commit to branch master
in repository infinipath-psm.
commit 4ae73250404dc6b593d1f4d0bbb3cd0708e72cf3
Author: Henry Estela <henry.r.estela at intel.com>
Date: Fri Dec 11 10:25:29 2015 -0700
Reduce virtual memory usage in non-SCIF builds.
PSM would allocate enough shared memory for 256 processes when compiled without
SCIF support. This patch instead sizes the allocation by the number of MPI
processes running on the node or, failing that, the number of processors online.
If neither MPI_LOCALNRANKS nor PSC_MPI_PPN is set, the number of cores
online determines how much virtual memory is allocated. In that case the
maximum number of jobs that can run on a node is limited to the number of
cores online.
---
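Note on the sizing rule: the commit message describes a fallback chain (local
rank count from the environment, otherwise online cores). The standalone sketch
below illustrates that rule only. It is not PSM code: the body of
psmi_sharedcontext_params() is not shown in this diff, so the helper names
env_positive_long() and local_proc_count() are hypothetical, and checking
MPI_LOCALNRANKS before PSC_MPI_PPN is an assumed order.

```c
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: parse a positive integer from an environment
 * variable. Returns -1 if the variable is unset, empty, or not a
 * positive decimal number. */
static long env_positive_long(const char *name)
{
    const char *val = getenv(name);
    char *end;
    long n;

    if (val == NULL || *val == '\0')
        return -1;
    n = strtol(val, &end, 10);
    if (*end != '\0' || n <= 0)
        return -1;
    return n;
}

/* Hypothetical stand-in for the sizing decision: prefer the local rank
 * count exported by the launcher, otherwise fall back to the number of
 * processors currently online. The lookup order is an assumption. */
static long local_proc_count(void)
{
    long n = env_positive_long("MPI_LOCALNRANKS");

    if (n <= 0)
        n = env_positive_long("PSC_MPI_PPN");
    if (n <= 0)
        n = sysconf(_SC_NPROCESSORS_ONLN);
    return n;
}
```

With MPI_LOCALNRANKS=4 in the environment, local_proc_count() returns 4. With
both variables unset it returns the online processor count, which is the case
where the commit message says the number of jobs per node is capped.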
psm_context.c | 2 --
psm_context.h | 1 +
ptl_am/am_reqrep_shmem.c | 23 ++++++++++++++++++++++-
3 files changed, 23 insertions(+), 3 deletions(-)
diff --git a/psm_context.c b/psm_context.c
index 2728fab..390b49a 100644
--- a/psm_context.c
+++ b/psm_context.c
@@ -43,7 +43,6 @@
#endif
#define PSMI_SHARED_CONTEXTS_ENABLED_BY_DEFAULT 1
-static int psmi_sharedcontext_params(int *nranks, int *rankid);
static int psmi_get_hca_selection_algorithm(void);
static psm_error_t psmi_init_userinfo_params(psm_ep_t ep,
int unit_id, int port,
@@ -593,7 +592,6 @@ fail:
return err;
}
-static
int
psmi_sharedcontext_params(int *nranks, int *rankid)
{
diff --git a/psm_context.h b/psm_context.h
index 72d382c..635bb10 100644
--- a/psm_context.h
+++ b/psm_context.h
@@ -67,6 +67,7 @@ psm_error_t psmi_context_check_status(const psmi_context_t *context);
psm_error_t psmi_context_interrupt_set(psmi_context_t *context, int enable);
int psmi_context_interrupt_isenabled(psmi_context_t *context);
+int psmi_sharedcontext_params(int *nranks, int *rankid);
/* Runtime flags describe what features are enabled in hw/sw and which
* corresponding PSM features are being used.
*
diff --git a/ptl_am/am_reqrep_shmem.c b/ptl_am/am_reqrep_shmem.c
index 8b7a962..ff8cea8 100644
--- a/ptl_am/am_reqrep_shmem.c
+++ b/ptl_am/am_reqrep_shmem.c
@@ -389,8 +389,20 @@ psmi_shm_attach(psm_ep_t ep, int *shmidx_o)
amsh_init_segment(), and physical memory is only allocated by the OS
accordingly. So, it looks like this is consumes a lot of memory,
but really it consumes as much as necessary for each active process. */
+#ifdef PSM_HAVE_SCIF
segsz = psmi_amsh_segsize(PTL_AMSH_MAX_LOCAL_PROCS,
PTL_AMSH_MAX_LOCAL_NODES);
+#else
+ /* In the non-SCIF case we should be able to get away with just allocating
+ * enough shm for the number of mpi ranks, if the number of ranks is
+ * unavailable, then we will fallback to the number of online cpu cores.
+ * This will help cut back on virtual memory usage.
+ */
+ int nranks, rankid, nprocs;
+ psmi_sharedcontext_params(&nranks, &rankid);
+ nprocs = (nranks <= 0) ? sysconf(_SC_NPROCESSORS_ONLN) : nranks;
+ segsz = psmi_amsh_segsize(nprocs, PTL_AMSH_MAX_LOCAL_NODES);
+#endif
ep->amsh_shmfd = shm_open(ep->amsh_keyname,
O_RDWR | O_CREAT | O_EXCL | O_TRUNC, S_IRWXU);
@@ -820,14 +832,23 @@ psmi_shm_detach(psm_ep_t ep)
pthread_mutex_unlock((pthread_mutex_t *) &(ep->amsh_dirpage->lock));
/* Instead of dynamically shrinking the shared memory region, we always
- leave it allocated for up to PTL_AMSH_MAX_LOCAL_PROCS processes.
+ leave it allocated for up to PTL_AMSH_MAX_LOCAL_PROCS or number
+ of processors online.
Thus mremap() is never necessary, nor is ftruncate() here.
However when the attached process count does go to 0, we should
fully munmap() the entire region.
*/
+#ifdef PSM_HAVE_SCIF
if (munmap((void *) ep->amsh_shmbase,
psmi_amsh_segsize(PTL_AMSH_MAX_LOCAL_PROCS,
PTL_AMSH_MAX_LOCAL_NODES))) {
+#else
+ int nranks, rankid, nprocs;
+ psmi_sharedcontext_params(&nranks, &rankid);
+ nprocs = (nranks <= 0) ? sysconf(_SC_NPROCESSORS_ONLN) : nranks;
+ if (munmap((void *) ep->amsh_shmbase,
+ psmi_amsh_segsize(nprocs, PTL_AMSH_MAX_LOCAL_NODES))) {
+#endif
err = psmi_handle_error(NULL, PSM_SHMEM_SEGMENT_ERR,
"Error with munamp of shared segment: %s", strerror(errno));
goto fail;
--
Alioth's /usr/local/bin/git-commit-notice on /srv/git.debian.org/git/pkg-ofed/infinipath-psm.git