[Pkg-ofed-commits] r294 - in branches/ofed-1.4.1upgrade/qperf/trunk: . debian src
Guy Coates
gmpc-guest at alioth.debian.org
Fri May 29 14:04:57 UTC 2009
Author: gmpc-guest
Date: 2009-05-29 14:04:56 +0000 (Fri, 29 May 2009)
New Revision: 294
Modified:
branches/ofed-1.4.1upgrade/qperf/trunk/AUTHORS
branches/ofed-1.4.1upgrade/qperf/trunk/configure.in
branches/ofed-1.4.1upgrade/qperf/trunk/debian/changelog
branches/ofed-1.4.1upgrade/qperf/trunk/debian/control
branches/ofed-1.4.1upgrade/qperf/trunk/qperf.spec
branches/ofed-1.4.1upgrade/qperf/trunk/src/help.txt
branches/ofed-1.4.1upgrade/qperf/trunk/src/mkhelp
branches/ofed-1.4.1upgrade/qperf/trunk/src/qperf.c
branches/ofed-1.4.1upgrade/qperf/trunk/src/qperf.h
branches/ofed-1.4.1upgrade/qperf/trunk/src/rdma.c
branches/ofed-1.4.1upgrade/qperf/trunk/src/rds.c
branches/ofed-1.4.1upgrade/qperf/trunk/src/socket.c
branches/ofed-1.4.1upgrade/qperf/trunk/src/support.c
Log:
Add OFED 1.4.1 release
Modified: branches/ofed-1.4.1upgrade/qperf/trunk/AUTHORS
===================================================================
--- branches/ofed-1.4.1upgrade/qperf/trunk/AUTHORS 2009-05-29 14:02:35 UTC (rev 293)
+++ branches/ofed-1.4.1upgrade/qperf/trunk/AUTHORS 2009-05-29 14:04:56 UTC (rev 294)
@@ -4,3 +4,4 @@
Dotan Barak
Ralph Campbell
Yevgeny Kliteynik
+ Dave Olson
Modified: branches/ofed-1.4.1upgrade/qperf/trunk/configure.in
===================================================================
--- branches/ofed-1.4.1upgrade/qperf/trunk/configure.in 2009-05-29 14:02:35 UTC (rev 293)
+++ branches/ofed-1.4.1upgrade/qperf/trunk/configure.in 2009-05-29 14:04:56 UTC (rev 294)
@@ -1,5 +1,5 @@
-AC_INIT(qperf, 0.4.1, general at lists.openfabrics.org)
-AM_INIT_AUTOMAKE(qperf, 0.4.1)
+AC_INIT(qperf, 0.4.4, general at lists.openfabrics.org)
+AM_INIT_AUTOMAKE(qperf, 0.4.4)
AC_PROG_CC
AC_CHECK_LIB(ibverbs, ibv_open_device, RDMA=1)
AC_CHECK_LIB(rdmacm, rdma_create_id)
Modified: branches/ofed-1.4.1upgrade/qperf/trunk/debian/changelog
===================================================================
--- branches/ofed-1.4.1upgrade/qperf/trunk/debian/changelog 2009-05-29 14:02:35 UTC (rev 293)
+++ branches/ofed-1.4.1upgrade/qperf/trunk/debian/changelog 2009-05-29 14:04:56 UTC (rev 294)
@@ -1,3 +1,9 @@
+qperf (0.4.4-1) unstable; urgency=low
+
+ * OFED 1.4.1 upstream release
+
+ -- Guy Coates <gmpc at sanger.ac.uk> Fri, 29 May 2009 15:02:24 +0100
+
qperf (0.4.1-1) unstable; urgency=low
* Backport ofed-1.4.1-rc1 release to ofed.1.4
* Initial release (Closes: #nnnn) <nnnn is the bug number of your ITP>
Modified: branches/ofed-1.4.1upgrade/qperf/trunk/debian/control
===================================================================
--- branches/ofed-1.4.1upgrade/qperf/trunk/debian/control 2009-05-29 14:02:35 UTC (rev 293)
+++ branches/ofed-1.4.1upgrade/qperf/trunk/debian/control 2009-05-29 14:04:56 UTC (rev 294)
@@ -2,13 +2,13 @@
Section: net
Priority: extra
Maintainer: Guy Coates <gmpc at sanger.ac.uk>
-Build-Depends: debhelper (>= 7), autotools-dev, libibverbs-dev, librdmacm-dev
+Build-Depends: debhelper (>= 7), autotools-dev, libibverbs-dev (>=1.1.2-1+OFED), librdmacm-dev (>=1.0.8-1+OFED)
Standards-Version: 3.8.0
Homepage: http://www.openfabrics.org
Package: qperf
Architecture: any
-Depends: ${shlibs:Depends}, ${misc:Depends}
+Depends: ${shlibs:Depends}, ${misc:Depends}, libibverbs1 (>=1.1.2-1+OFED), librdmacm1 (>=1.0.8-1+OFED)
Description: A tool to measure network bandwidth and latency
qperf measures network bandwidth and latency between two nodes.
It can work over TCP/IP as well as Infiniband RDMA transports.
Modified: branches/ofed-1.4.1upgrade/qperf/trunk/qperf.spec
===================================================================
--- branches/ofed-1.4.1upgrade/qperf/trunk/qperf.spec 2009-05-29 14:02:35 UTC (rev 293)
+++ branches/ofed-1.4.1upgrade/qperf/trunk/qperf.spec 2009-05-29 14:04:56 UTC (rev 294)
@@ -1,10 +1,10 @@
Name: qperf
Summary: Measure socket and RDMA performance
-Version: 0.4.1
-Release: 1.ofed1.4.rc1
+Version: 0.4.4
+Release: 1.ofed1.4.1
License: BSD 3-Clause, GPL v2
Group: Networking/Diagnostic
-Source: http://www.openfabrics.org/downloads/qperf-0.4.1.tar.gz
+Source: http://www.openfabrics.org/downloads/qperf-0.4.4.tar.gz
Url: http://www.openfabrics.org
BuildRoot: %{_tmppath}/%{name}-%{version}-build
BuildRequires: libibverbs-devel
Modified: branches/ofed-1.4.1upgrade/qperf/trunk/src/help.txt
===================================================================
--- branches/ofed-1.4.1upgrade/qperf/trunk/src/help.txt 2009-05-29 14:02:35 UTC (rev 293)
+++ branches/ofed-1.4.1upgrade/qperf/trunk/src/help.txt 2009-05-29 14:04:56 UTC (rev 294)
@@ -94,6 +94,9 @@
udp_lat
ver_rc_compare_swap
ver_rc_fetch_add
+ xrc_bi_bw
+ xrc_bw
+ xrc_lat
Examples
In these examples, we first run qperf on a node called myserver in server
mode by invoking it with no arguments. In all the subsequent examples, we
@@ -109,29 +112,35 @@
qperf myserver ud_lat ud_bw
* To measure RDMA UC bi-directional bandwidth:
qperf myserver rc_bi_bw
+ * To get a range of TCP latencies with a message size from 1 to 64K
+ qperf myserver -oo msg_size:1:64K:*2 -vu tcp_lat
Opts
--access_recv OnOff (-ar) Turn on/off accessing received data
- -aro Cause received data to be accessed
+ -ar1 Cause received data to be accessed
+ --alt_port Port (-ap) Set alternate path port
+ --loc_alt_port Port (-lap) Set local alternate path port
+ --rem_alt_port Port (-rap) Set remote alternate path port
--cpu_affinity PN (-ca) Set processor affinity
--loc_cpu_affinity PN (-lca) Set local processor affinity
--rem_cpu_affinity PN (-rca) Set remote processor affinity
--flip OnOff (-f) Flip on/off sender and receiver
- -fo Flip (on) sender and receiver
+ -f1 Flip (on) sender and receiver
--help Topic (-h) Get more information on a topic
--host Node (-H) Identify server node
--id Device:Port (-i) Set RDMA device and port
--loc_id Device:Port (-li) Set local RDMA device and port
--rem_id Device:Port (-ri) Set remote RDMA device and port
--listen_port Port (-lp) Set server listen port
+ --loop Var:Init:Last:Incr (-oo) Sequence through values
--msg_size Size (-m) Set message size
--mtu_size Size (-mt) Set MTU size (RDMA only)
--no_msgs Count (-n) Send Count messages
--cq_poll OnOff Set polling mode on/off
--loc_cq_poll OnOff (-lcp) Set local polling mode on/off
--rem_cq_poll OnOff (-rcp) Set remote polling mode on/off
- -cpo Turn polling mode on
- -lcpo Turn local polling mode on
- -rcpo Turn remote polling mode on
+ -cp1 Turn polling mode on
+ -lcp1 Turn local polling mode on
+ -rcp1 Turn remote polling mode on
--ip_port Port (-ip) Set TCP port used for tests
--precision Digits (-e) Set precision reported
--rd_atomic Max (-nr) Set RDMA read/atomic count
@@ -143,6 +152,9 @@
--sock_buf_size Size (-sb) Set socket buffer size
--loc_sock_buf_size Size (-lsb) Set local socket buffer size
--rem_sock_buf_size Size (-rsb) Set remote socket buffer size
+ --src_path_bits num (-sp) Set source path bits
+ --loc_src_path_bits num (-lsp) Set local source path bits
+ --rem_src_path_bits num (-rsp) Set remote source path bits
--static_rate (-sr) Set IB static rate
--loc_static_rate (-lsr) Set local IB static rate
--rem_static_rate (-rsr) Set remote IB static rate
@@ -154,7 +166,7 @@
--unify_units (-uu) Unify units
--use_bits_per_sec (-ub) Use bits/sec rather than bytes/sec
--use_cm OnOff (-cm) Use RDMA Connection Manager or not
- -cmo Use RDMA Connection Manager
+ -cm1 Use RDMA Connection Manager
--verbose (-v) Verbose; turn on all of -v[cstu]
--verbose_conf (-vc) Show configuration information
--verbose_stat (-vs) Show statistical information
@@ -172,8 +184,14 @@
If OnOff is non-zero, data is accessed once received. Otherwise,
data is ignored. By default, OnOff is 0. This can help to mimic
some applications.
- -aro
+ -ar1
Cause received data to be accessed.
+ --alt_port Port (-ap)
+ Set alternate path port. This enables automatic path failover.
+ --loc_alt_port Port (-lap)
+ Set local alternate path port. This enables automatic path failover.
+ --rem_alt_port Port (-rap)
+ Set remote alternate path port. This enables automatic path failover.
--cpu_affinity PN (-ca)
Set cpu affinity to PN. CPUs are numbered sequentially from 0. If
PN is "any", any cpu is allowed otherwise the cpu is limited to the
@@ -184,7 +202,7 @@
Set remote processor affinity to PN.
--flip OnOff (-f)
If non-zero, cause sender and receiver to play opposite roles.
- -fo
+ -f1
Cause sender and receiver to play opposite roles.
--help Topic (-h)
Print out information about Topic. To see the list of topics, type
@@ -202,6 +220,11 @@
Set the port we listen on to ListenPort. This must be set to the
same port on both the server and client machines. The default value
is 19765.
+ --loop Var:Init:Last:Incr (-oo)
+ Run a test multiple times sequencing through a series of values. Var
+ is the loop variable; Init is the initial value; Last is the value it
+ must not exceed and Incr is the increment. It is useful to set the
+ --verbose_used (-vu) option in conjunction with this option.
--msg_size Size (-m)
Set the message size to Size. The default value varies by test. It
is assumed that the value is specified in bytes however, a trailing
@@ -222,11 +245,11 @@
Locally turn polling mode on or off.
--rem_cq_poll OnOff (-rcp)
Remotely turn polling mode on or off.
- -cpo
+ -cp1
Turn polling mode on.
- -lcpo
+ -lcp1
Turn local polling mode on.
- -rcpo
+ -rcp1
Turn remote polling mode on.
--ip_port Port (-ip)
Use Port to run the socket tests. This is different from
@@ -258,6 +281,13 @@
Set local socket buffer size.
--rem_sock_buf_size Size (-rsb)
Set remote socket buffer size.
+ --src_path_bits N (-sp)
+ Set source path bits. If the LMC is not zero, this will cause the
+ connection to use a LID with the low order LMC bits set to N.
+ --loc_src_path_bits N (-lsp)
+ Set local source path bits.
+ --rem_src_path_bits N (-rsp)
+ Set remote source path bits.
--static_rate Rate (-sr)
Force InfiniBand static rate. Rate can be one of: 2.5, 5, 10, 20,
30, 40, 60, 80, 120, 1xSDR (2.5 Gbps), 1xDDR (5 Gbps), 1xQDR (10
@@ -294,7 +324,7 @@
necessary to use the CM for iWARP devices. The default is to
establish the connection without using the CM. This only works for
the tests that use the RC transport.
- -cmo
+ -cm1
Use RDMA Connection Manager.
--verbose (-v)
Provide more detailed output. Turns on -vc, -vs, -vt and -vu.
@@ -321,7 +351,7 @@
The current version of qperf is printed.
--wait_server Time (-ws)
If the server is not ready, continue to try connecting for Time
- seconds before giving up.
+ seconds before giving up. The default is 5 seconds.
Tests -RDMA
Miscellaneous
conf Show configuration
@@ -353,15 +383,18 @@
udp_bw UDP streaming one way bandwidth
udp_lat UDP one way latency
RDMA Send/Receive
- ud_bw UD streaming one way bandwidth
- ud_bi_bw UD streaming two way bandwidth
- ud_lat UD one way latency
+ rc_bi_bw RC streaming two way bandwidth
rc_bw RC streaming one way bandwidth
- rc_bi_bw RC streaming two way bandwidth
rc_lat RC one way latency
+ uc_bi_bw UC streaming two way bandwidth
uc_bw UC streaming one way bandwidth
- uc_bi_bw UC streaming two way bandwidth
uc_lat UC one way latency
+ ud_bi_bw UD streaming two way bandwidth
+ ud_bw UD streaming one way bandwidth
+ ud_lat UD one way latency
+ xrc_bi_bw XRC streaming two way bandwidth
+ xrc_bw XRC streaming one way bandwidth
+ xrc_lat XRC one way latency
RDMA
rc_rdma_read_bw RC RDMA read streaming one way bandwidth
rc_rdma_read_lat RC RDMA read one way latency
@@ -869,14 +902,15 @@
--cq_poll OnOff Set polling mode on/off
--time (-t) Set test duration
Other Options
- --cpu_affinity, --listen_port, --mtu_size, --rd_atomic, --static_rate,
- --timeout
+ --cpu_affinity, --listen_port, --msg_size, --mtu_size, --rd_atomic,
+ --static_rate, --timeout
Display Options
--precision, --unify_nodes, --unify_units, --verbose
Description
Test the RC Compare and Swap Atomic operation. The server's memory
- location starts with zero and the client successively exchanges, 0 for
- 1, 1 for 2, etc. The results are checked for correctness.
+ location starts with zero and the client successively makes exchanges
+ with a variety of different values. The results are checked for
+ correctness.
ver_rc_fetch_add +RDMA
Purpose
Verify RC fetch and add
@@ -885,8 +919,8 @@
--cq_poll OnOff Set polling mode on/off
--time (-t) Set test duration
Other Options
- --cpu_affinity, --listen_port, --mtu_size, --rd_atomic, --static_rate,
- --timeout
+ --cpu_affinity, --listen_port, --msg_size, --mtu_size, --rd_atomic,
+ --static_rate, --timeout
Display Options
--precision, --unify_nodes, --unify_units, --use_bits_per_sec,
--verbose
@@ -894,3 +928,52 @@
Tests the RC Fetch and Add Atomic operation. The server's memory
location starts with zero and the client successively adds one. The
results are checked for correctness.
+xrc_bw +RDMA
+ Purpose
+ XRC streaming one way bandwidth
+ Common Options
+ --access_recv OnOff (-ar) Access received data
+ --id Device:Port (-i) Set RDMA device and port
+ --msg_size Size (-m) Set message size
+ --cq_poll OnOff Set polling mode on/off
+ --time (-t) Set test duration
+ Other Options
+ --cpu_affinity, --listen_port, --mtu_size, --static_rate, --timeout
+ Display Options
+ --precision, --unify_nodes, --unify_units, --use_bits_per_sec,
+ --verbose
+ Description
+ The client sends messages to the server who notes how many it received.
+ The XRC Send/Receive mechanism is used.
+xrc_bi_bw +RDMA
+ Purpose
+ XRC streaming two way bandwidth
+ Common Options
+ --access_recv OnOff (-ar) Access received data
+ --id Device:Port (-i) Set RDMA device and port
+ --msg_size Size (-m) Set message size
+ --cq_poll OnOff Set polling mode on/off
+ --time (-t) Set test duration
+ Other Options
+ --cpu_affinity, --listen_port, --mtu_size, --static_rate, --timeout
+ Display Options
+ --precision, --unify_nodes, --unify_units, --use_bits_per_sec,
+ --verbose
+ Description
+ Both the client and server exchange messages with each other using the
+ XRC Send/Receive mechanism and note how many were received.
+xrc_lat +RDMA
+ Purpose
+ XRC one way latency
+ Common Options
+ --id Device:Port (-i) Set RDMA device and port
+ --msg_size Size (-m) Set message size
+ --cq_poll OnOff Set polling mode on/off
+ --time (-t) Set test duration
+ Other Options
+ --cpu_affinity, --listen_port, --mtu_size, --static_rate, --timeout
+ Display Options
+ --precision, --unify_nodes, --unify_units, --verbose
+ Description
+ A ping pong latency test where the server and client exchange messages
+ repeatedly using XRC Send/Receive.
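Note on the --loop option documented in the help.txt changes above: the sequence of values it steps through can be sketched as follows. This is a minimal illustration, not code from qperf. The function name loop_values and its parameters are hypothetical, and the guards against non-terminating steps are an assumption of this sketch rather than something the commit is known to do.

```c
/*
 * Hypothetical helper (not part of qperf): fill out[] with the values that
 * a loop such as "msg_size:1:64K:*2" would step through and return how many
 * were produced.  If mult is non-zero the value is multiplied by incr on
 * each step, otherwise incr is added.  Values are assumed non-negative.
 */
static int
loop_values(long init, long last, long incr, int mult, long *out, int max)
{
    int n = 0;
    long l = init;

    /* Reject steps that would never reach the limit. */
    if (init < 0 || incr < 1)
        return 0;
    if (mult && (incr < 2 || init < 1))
        return 0;

    while (l <= last && n < max) {
        out[n++] = l;
        if (mult) {
            if (l > last / incr)    /* Next value would exceed last */
                break;
            l *= incr;
        } else {
            if (l > last - incr)    /* Next value would exceed last */
                break;
            l += incr;
        }
    }
    return n;
}
```

With init 1, last 65536 (64K), incr 2 and mult set, this yields the 17 powers of two from 1 to 65536, which matches the range given in the "-oo msg_size:1:64K:*2" example in the help text.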
Modified: branches/ofed-1.4.1upgrade/qperf/trunk/src/mkhelp
===================================================================
--- branches/ofed-1.4.1upgrade/qperf/trunk/src/mkhelp 2009-05-29 14:02:35 UTC (rev 293)
+++ branches/ofed-1.4.1upgrade/qperf/trunk/src/mkhelp 2009-05-29 14:04:56 UTC (rev 294)
@@ -10,8 +10,8 @@
/*
* This was generated from $help_txt. Do not modify directly.
*
- * Copyright (c) 2002-2008 Johann George. All rights reserved.
- * Copyright (c) 2006-2008 QLogic Corporation. All rights reserved.
+ * Copyright (c) 2002-2009 Johann George. All rights reserved.
+ * Copyright (c) 2006-2009 QLogic Corporation. All rights reserved.
*
* This software is available to you under a choice of one of two
* licenses. You may choose to be licensed under the terms of the GNU
Modified: branches/ofed-1.4.1upgrade/qperf/trunk/src/qperf.c
===================================================================
--- branches/ofed-1.4.1upgrade/qperf/trunk/src/qperf.c 2009-05-29 14:02:35 UTC (rev 293)
+++ branches/ofed-1.4.1upgrade/qperf/trunk/src/qperf.c 2009-05-29 14:04:56 UTC (rev 294)
@@ -2,8 +2,8 @@
* qperf - main.
* Measure socket and RDMA performance.
*
- * Copyright (c) 2002-2008 Johann George. All rights reserved.
- * Copyright (c) 2006-2008 QLogic Corporation. All rights reserved.
+ * Copyright (c) 2002-2009 Johann George. All rights reserved.
+ * Copyright (c) 2006-2009 QLogic Corporation. All rights reserved.
*
* This software is available to you under a choice of one of two
* licenses. You may choose to be licensed under the terms of the GNU
@@ -62,7 +62,7 @@
*/
#define VER_MAJ 0 /* Major version */
#define VER_MIN 4 /* Minor version */
-#define VER_INC 1 /* Incremental version */
+#define VER_INC 4 /* Incremental version */
#define LISTENQ 5 /* Size of listen queue */
#define BUFSIZE 1024 /* Size of buffers */
@@ -88,6 +88,19 @@
/*
+ * Used to loop through a range of values.
+ */
+typedef struct LOOP {
+ struct LOOP *next; /* Pointer to next loop */
+ OPTION *option; /* Loop variable */
+ long init; /* Initial value */
+ long last; /* Last value */
+ long incr; /* Increment */
+ int mult; /* If set, multiply, otherwise add */
+} LOOP;
+
+
+/*
* Parameter information.
*/
typedef struct PAR_INFO {
@@ -156,80 +169,85 @@
/*
* Function prototypes.
*/
-static void add_ustat(USTAT *l, USTAT *r);
-static long arg_long(char ***argvp);
-static long arg_size(char ***argvp);
-static char *arg_strn(char ***argvp);
-static long arg_time(char ***argvp);
-static void calc_node(RESN *resn, STAT *stat);
-static void calc_results(void);
-static void client(TEST *test);
-static int cmpsub(char *s2, char *s1);
-static char *commify(char *data);
-static void dec_req_data(REQ *host);
-static void dec_req_version(REQ *host);
-static void dec_stat(STAT *host);
-static void dec_ustat(USTAT *host);
-static void do_args(char *args[]);
-static void do_option(OPTION *option, char ***argvp);
-static void enc_req(REQ *host);
-static void enc_stat(STAT *host);
-static void enc_ustat(USTAT *host);
-static TEST *find_test(char *name);
-static OPTION *find_option(char *name);
-static void get_conf(CONF *conf);
-static void get_cpu(CONF *conf);
-static void get_times(CLOCK timex[T_N]);
-static void initialize(void);
-static void init_lstat(void);
-static void init_vars(void);
-static int nice_1024(char *pref, char *name, long long value);
+static void add_ustat(USTAT *l, USTAT *r);
+static long arg_long(char ***argvp);
+static long arg_size(char ***argvp);
+static char *arg_strn(char ***argvp);
+static long arg_time(char ***argvp);
+static void calc_node(RESN *resn, STAT *stat);
+static void calc_results(void);
+static void client(TEST *test);
+static int cmpsub(char *s2, char *s1);
+static char *commify(char *data);
+static void dec_req_data(REQ *host);
+static void dec_req_version(REQ *host);
+static void dec_stat(STAT *host);
+static void dec_ustat(USTAT *host);
+static void do_args(char *args[]);
+static void do_loop(LOOP *loop, TEST *test);
+static void do_option(OPTION *option, char ***argvp);
+static void enc_req(REQ *host);
+static void enc_stat(STAT *host);
+static void enc_ustat(USTAT *host);
+static TEST *find_test(char *name);
+static OPTION *find_option(char *name);
+static void get_conf(CONF *conf);
+static void get_cpu(CONF *conf);
+static void get_times(CLOCK timex[T_N]);
+static void initialize(void);
+static void init_lstat(void);
+static char *loop_arg(char **pp);
+static int nice_1024(char *pref, char *name, long long value);
static PAR_INFO *par_info(PAR_INDEX index);
static PAR_INFO *par_set(char *name, PAR_INDEX index);
-static int par_isset(PAR_INDEX index);
-static void place_any(char *pref, char *name, char *unit, char *data,
- char *altn);
-static void place_show(void);
-static void place_val(char *pref, char *name, char *unit, double value);
-static void remotefd_close(void);
-static void remotefd_setup(void);
-static void run_client_conf(void);
-static void run_client_quit(void);
-static void run_server_conf(void);
-static void run_server_quit(void);
-static void server(void);
-static void server_listen(void);
-static int server_recv_request(void);
-static void set_affinity(void);
-static void set_signals(void);
-static void show_debug(void);
-static void show_info(MEASURE measure);
-static void show_rest(void);
-static void show_used(void);
-static void sig_alrm(int signo, siginfo_t *siginfo, void *ucontext);
-static void sig_quit(int signo, siginfo_t *siginfo, void *ucontext);
-static void sig_urg(int signo, siginfo_t *siginfo, void *ucontext);
-static char *skip_colon(char *s);
-static void start_test_timer(int seconds);
-static void strncopy(char *d, char *s, int n);
-static int verbose(int type, double value);
-static void version_error(void);
-static void view_band(int type, char *pref, char *name, double value);
-static void view_cost(int type, char *pref, char *name, double value);
-static void view_cpus(int type, char *pref, char *name, double value);
-static void view_rate(int type, char *pref, char *name, double value);
-static void view_long(int type, char *pref, char *name, long long value);
-static void view_size(int type, char *pref, char *name, long long value);
-static void view_strn(int type, char *pref, char *name, char *value);
-static void view_time(int type, char *pref, char *name, double value);
+static int par_isset(PAR_INDEX index);
+static void parse_loop(char ***argvp);
+static void place_any(char *pref, char *name, char *unit, char *data,
+ char *altn);
+static void place_show(void);
+static void place_val(char *pref, char *name, char *unit, double value);
+static void remotefd_close(void);
+static void remotefd_setup(void);
+static void run_client_conf(void);
+static void run_client_quit(void);
+static void run_server_conf(void);
+static void run_server_quit(void);
+static void server(void);
+static void server_listen(void);
+static int server_recv_request(void);
+static void set_affinity(void);
+static void set_signals(void);
+static void show_debug(void);
+static void show_info(MEASURE measure);
+static void show_rest(void);
+static void show_used(void);
+static void sig_alrm(int signo, siginfo_t *siginfo, void *ucontext);
+static void sig_quit(int signo, siginfo_t *siginfo, void *ucontext);
+static void sig_urg(int signo, siginfo_t *siginfo, void *ucontext);
+static char *skip_colon(char *s);
+static void start_test_timer(int seconds);
+static long str_size(char *arg, char *str);
+static void strncopy(char *d, char *s, int n);
+static char *two_args(char ***argvp);
+static int verbose(int type, double value);
+static void version_error(void);
+static void view_band(int type, char *pref, char *name, double value);
+static void view_cost(int type, char *pref, char *name, double value);
+static void view_cpus(int type, char *pref, char *name, double value);
+static void view_rate(int type, char *pref, char *name, double value);
+static void view_long(int type, char *pref, char *name, long long value);
+static void view_size(int type, char *pref, char *name, long long value);
+static void view_strn(int type, char *pref, char *name, char *value);
+static void view_time(int type, char *pref, char *name, double value);
/*
* Configurable variables.
*/
-static int ListenPort = DEF_LISTEN_PORT;
-static int Precision = DEF_PRECISION;
-static int UseBitsPerSec = 0;
+static int ListenPort = DEF_LISTEN_PORT;
+static int Precision = DEF_PRECISION;
+static int ServerWait = DEF_TIMEOUT;
+static int UseBitsPerSec = 0;
/*
@@ -238,6 +256,7 @@
static REQ RReq;
static STAT IStat;
static int ListenFD;
+static LOOP *Loops;
static int ProcStatFD;
static STAT RStat;
static int ShowIndex;
@@ -248,7 +267,6 @@
static int VerboseStat;
static int VerboseTime;
static int VerboseUsed;
-static int Wait;
/*
@@ -273,6 +291,7 @@
PAR_NAME ParName[] ={
{ "access_recv", L_ACCESS_RECV, R_ACCESS_RECV },
{ "affinity", L_AFFINITY, R_AFFINITY },
+ { "alt_port", L_ALT_PORT, R_ALT_PORT },
{ "flip", L_FLIP, R_FLIP },
{ "id", L_ID, R_ID },
{ "msg_size", L_MSG_SIZE, R_MSG_SIZE },
@@ -283,6 +302,7 @@
{ "rd_atomic", L_RD_ATOMIC, R_RD_ATOMIC },
{ "service_level", L_SL, R_SL },
{ "sock_buf_size", L_SOCK_BUF_SIZE, R_SOCK_BUF_SIZE },
+ { "src_path_bits", L_SRC_PATH_BITS, R_SRC_PATH_BITS },
{ "time", L_TIME, R_TIME },
{ "timeout", L_TIMEOUT, R_TIMEOUT },
{ "use_cm", L_USE_CM, R_USE_CM },
@@ -299,6 +319,8 @@
{ R_ACCESS_RECV, 'l', &RReq.access_recv },
{ L_AFFINITY, 'l', &Req.affinity },
{ R_AFFINITY, 'l', &RReq.affinity },
+ { L_ALT_PORT, 'l', &Req.alt_port },
+ { R_ALT_PORT, 'l', &RReq.alt_port },
{ L_FLIP, 'l', &Req.flip },
{ R_FLIP, 'l', &RReq.flip },
{ L_ID, 'p', &Req.id },
@@ -319,6 +341,8 @@
{ R_SL, 'l', &RReq.sl },
{ L_SOCK_BUF_SIZE, 's', &Req.sock_buf_size },
{ R_SOCK_BUF_SIZE, 's', &RReq.sock_buf_size },
+ { L_SRC_PATH_BITS, 's', &Req.src_path_bits },
+ { R_SRC_PATH_BITS, 's', &RReq.src_path_bits },
{ L_STATIC_RATE, 'p', &Req.static_rate },
{ R_STATIC_RATE, 'p', &RReq.static_rate },
{ L_TIME, 't', &Req.time },
@@ -378,6 +402,13 @@
{ "-vS", "-vvs", },
{ "-vT", "-vvt", },
{ "-vU", "-vvu", },
+ /* options that are on */
+ { "-aro", "-ar1" },
+ { "-cmo", "-cm1" },
+ { "-fo", "-f1" },
+ { "-cpo", "-cp1" },
+ { "-lcpo", "-lcp1" },
+ { "-rcpo", "-rcp1" },
/* miscellaneous */
{ "-Ar", "-ar" },
{ "-M", "-mt" },
@@ -393,7 +424,13 @@
OPTION Options[] ={
{ "--access_recv", "int", L_ACCESS_RECV, R_ACCESS_RECV },
{ "-ar", "int", L_ACCESS_RECV, R_ACCESS_RECV },
- { "-aro", "set1", L_ACCESS_RECV, R_ACCESS_RECV },
+ { "-ar1", "set1", L_ACCESS_RECV, R_ACCESS_RECV },
+ { "--alt_port", "int", L_ALT_PORT, R_ALT_PORT },
+ { "-ap", "int", L_ALT_PORT, R_ALT_PORT },
+ { "--loc_alt_port", "int", L_ALT_PORT, },
+ { "-lap", "int", L_ALT_PORT, },
+ { "--rem_alt_port", "int", R_ALT_PORT },
+ { "-rap", "int", R_ALT_PORT },
{ "--cpu_affinity", "int", L_AFFINITY, R_AFFINITY },
{ "-ca", "int", L_AFFINITY, R_AFFINITY },
{ "--loc_cpu_affinity", "int", L_AFFINITY, },
@@ -404,7 +441,7 @@
{ "-D", "Sdebug", },
{ "--flip", "int", L_FLIP, R_FLIP },
{ "-f", "int", L_FLIP, R_FLIP },
- { "-fo", "set1", L_FLIP, R_FLIP },
+ { "-f1", "set1", L_FLIP, R_FLIP },
{ "--help", "help" },
{ "-h", "help" },
{ "--host", "host", },
@@ -417,6 +454,8 @@
{ "-ri", "str", R_ID },
{ "--listen_port", "Slp", },
{ "-lp", "Slp", },
+ { "--loop", "loop", },
+ { "-oo", "loop", },
{ "--msg_size", "size", L_MSG_SIZE, R_MSG_SIZE },
{ "-m", "size", L_MSG_SIZE, R_MSG_SIZE },
{ "--mtu_size", "size", L_MTU_SIZE, R_MTU_SIZE },
@@ -425,13 +464,13 @@
{ "-n", "int", L_NO_MSGS, R_NO_MSGS },
{ "--cq_poll", "int", L_POLL_MODE, R_POLL_MODE },
{ "-cp", "int", L_POLL_MODE, R_POLL_MODE },
- { "-cpo", "set1", L_POLL_MODE, R_POLL_MODE },
+ { "-cp1", "set1", L_POLL_MODE, R_POLL_MODE },
{ "--loc_cq_poll", "int", L_POLL_MODE, },
{ "-lcp", "int", L_POLL_MODE, },
- { "-lcpo", "set1", L_POLL_MODE },
+ { "-lcp1", "set1", L_POLL_MODE },
{ "--rem_cq_poll", "int", R_POLL_MODE },
{ "-rcp", "int", R_POLL_MODE },
- { "-rcpo", "set1", R_POLL_MODE },
+ { "-rcp1", "set1", R_POLL_MODE },
{ "--ip_port", "int", L_PORT, R_PORT },
{ "-ip", "int", L_PORT, R_PORT },
{ "--precision", "precision", },
@@ -454,6 +493,12 @@
{ "-lsb", "size", L_SOCK_BUF_SIZE },
{ "--rem_sock_buf_size", "size", R_SOCK_BUF_SIZE },
{ "-rsb", "size", R_SOCK_BUF_SIZE },
+ { "--src_path_bits", "size", L_SRC_PATH_BITS, R_SRC_PATH_BITS },
+ { "-sp", "size", L_SRC_PATH_BITS, R_SRC_PATH_BITS },
+ { "--loc_src_path_bits", "size", L_SRC_PATH_BITS },
+ { "-lsp", "size", L_SRC_PATH_BITS },
+ { "--rem_src_path_bits", "size", R_SRC_PATH_BITS },
+ { "-rsp", "size", R_SRC_PATH_BITS },
{ "--static_rate", "str", L_STATIC_RATE, R_STATIC_RATE },
{ "-sr", "str", L_STATIC_RATE, R_STATIC_RATE },
{ "--loc_static_rate", "str", L_STATIC_RATE },
@@ -476,7 +521,7 @@
{ "-ub", "ub", },
{ "--use_cm", "int", L_USE_CM, R_USE_CM },
{ "-cm", "int", L_USE_CM, R_USE_CM },
- { "-cmo", "set1", L_USE_CM, R_USE_CM },
+ { "-cm1", "set1", L_USE_CM, R_USE_CM },
{ "--verbose", "v", },
{ "-v", "v", },
{ "--verbose_conf", "vc", },
@@ -543,6 +588,9 @@
test(ud_lat),
test(ver_rc_compare_swap),
test(ver_rc_fetch_add),
+ test(xrc_bi_bw),
+ test(xrc_bw),
+ test(xrc_lat),
#endif
};
@@ -558,21 +606,11 @@
/*
- * Initialize.
+ * Initialize variables.
*/
static void
initialize(void)
{
- init_vars();
-}
-
-
-/*
- * Initialize variables.
- */
-static void
-init_vars(void)
-{
int i;
RemoteFD = -1;
@@ -595,6 +633,7 @@
{
for (;;) {
int c = *s++;
+
if (c == ':')
break;
if (c == '\0')
@@ -616,6 +655,7 @@
for (;;) {
int c1 = *s1++;
int c2 = *s2++;
+
if (c1 == '\0')
return 1;
if (c2 == '\0')
@@ -674,7 +714,7 @@
static void
sig_urg(int signo, siginfo_t *siginfo, void *ucontext)
{
- urgent_error();
+ urgent();
}
@@ -701,15 +741,17 @@
if (!ServerName)
ServerName = arg;
else {
- TEST *p = find_test(arg);
- if (!p)
+ TEST *test = find_test(arg);
+
+ if (!test)
error(0, "%s: bad test; try: qperf --help tests", arg);
- client(p);
+ do_loop(Loops, test);
testSpecified = 1;
}
++args;
}
}
+
if (!isClient)
server();
else if (!testSpecified) {
@@ -724,6 +766,34 @@
/*
+ * Loop through a series of tests.
+ */
+static void
+do_loop(LOOP *loop, TEST *test)
+{
+ if (!loop)
+ client(test);
+ else {
+ long l = loop->init;
+
+ while (l <= loop->last) {
+ char buf[64];
+ char *args[2] = {loop->option->name, buf};
+ char **argv = args;
+
+ snprintf(buf, sizeof(buf), "%ld", l);
+ do_option(loop->option, &argv);
+ do_loop(loop->next, test);
+ if (loop->mult)
+ l *= loop->incr;
+ else
+ l += loop->incr;
+ }
+ }
+}
+
+
+/*
* Given the name of an option, find it.
*/
static OPTION *
@@ -761,6 +831,7 @@
{
int n = cardof(Tests);
TEST *p = Tests;
+
for (; n--; ++p)
if (streq(name, p->name))
return p;
@@ -786,6 +857,7 @@
/* Help */
char **usage;
char *category = (*argvp)[1];
+
if (!category)
category = "main";
for (usage = Usage; *usage; usage += 2)
@@ -804,6 +876,8 @@
long v = arg_long(argvp);
setp_u32(option->name, option->arg1, v);
setp_u32(option->name, option->arg2, v);
+ } else if (streq(t, "loop")) {
+ parse_loop(argvp);
} else if (streq(t, "lp")) {
ListenPort = arg_long(argvp);
} else if (streq(t, "precision")) {
@@ -883,36 +957,126 @@
VerboseUsed = 2;
*argvp += 1;
} else if (streq(t, "wait")) {
- Wait = arg_time(argvp);
+ ServerWait = arg_time(argvp);
} else
error(BUG, "do_option: unknown type: %s", t);
}
/*
- * If any options were set but were not used, print out a warning message for
- * the user.
+ * Parse a loop option.
*/
-void
-opt_check(void)
+static void
+parse_loop(char ***argvp)
{
- PAR_INFO *p;
- PAR_INFO *q;
- PAR_INFO *r = endof(ParInfo);
+ char *opt = **argvp;
+ char *s = two_args(argvp);
+ char *name = loop_arg(&s);
+ char *init = loop_arg(&s);
+ char *last = loop_arg(&s);
+ char *incr = loop_arg(&s);
+ LOOP *loop = qmalloc(sizeof(LOOP));
- for (p = ParInfo; p < r; ++p) {
- if (p->used || !p->set)
- continue;
- error(RET, "warning: %s set but not used in test %s",
- p->name, TestName);
- for (q = p+1; q < r; ++q)
- if (q->set && q->name == p->name)
- q->set = 0;
+ memset(loop, 0, sizeof(*loop));
+
+ /* Parse variable name */
+ {
+ int n = cardof(Options);
+ OPTION *p = Options;
+
+ if (!name)
+ name = "msg_size";
+ for (;;) {
+ char *s = p->name;
+
+ if (n-- == 0)
+ error(0, "%s: %s: no such variable", opt, name);
+ if (*s++ != '-')
+ continue;
+ if (*s == '-')
+ s++;
+ if (streq(name, s))
+ break;
+ p++;
+ }
+ loop->option = p;
}
+
+ /* Parse increment */
+ if (!incr)
+ loop->incr = 0;
+ else {
+ if (incr[0] == '*') {
+ incr++;
+ loop->mult = 1;
+ }
+ loop->incr = str_size(incr, opt);
+ if (loop->incr < 1)
+ error(0, "%s: %s: increment must be positive", opt, incr);
+ }
+
+ /* Parse initial value */
+ if (init)
+ loop->init = str_size(init, opt);
+ else
+ loop->init = loop->mult ? 1 : 0;
+
+ /* Parse last value */
+ if (!last)
+ error(0, "%s: must specify limit", opt);
+ loop->last = str_size(last, opt);
+
+ /* Insert into loop list */
+ if (!Loops)
+ Loops = loop;
+ else {
+ LOOP *l = Loops;
+
+ while (l->next)
+ l = l->next;
+ l->next = loop;
+ }
}
/*
+ * Given a string consisting of arguments separated by colons, return the next
+ * argument and prepare for scanning the next one.
+ */
+static char *
+loop_arg(char **pp)
+{
+ char *a = *pp;
+ char *p = a;
+
+ while (*p) {
+ if (*p == ':') {
+ *p = '\0';
+ *pp = p + 1;
+ break;
+ }
+ ++p;
+ }
+ return a[0] ? a : 0;
+}
+
+
+/*
+ * Ensure that two arguments exist.
+ */
+static char *
+two_args(char ***argvp)
+{
+ char **argv = *argvp;
+
+ if (!argv[1])
+ error(0, "%s: missing argument", argv[0]);
+ *argvp += 2;
+ return argv[1];
+}
+
+
+/*
* Return the value of a long argument. It must be non-negative.
*/
static long
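The colon-separated scanning that loop_arg performs in the hunk above can be tried in isolation. The following is only a minimal sketch: next_field is a hypothetical name, not a qperf function, and the sketch assumes that the cursor should also advance once the last field has been consumed, so that a further call returns NULL.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical stand-in for loop_arg: return the next colon-separated
 * field of *cursor, or NULL if that field is empty.  The input string is
 * modified in place (each ':' becomes '\0') and *cursor is advanced.
 */
static char *
next_field(char **cursor)
{
    char *field;
    char *p;

    assert(cursor && *cursor);
    field = *cursor;
    p = field;

    while (*p && *p != ':')
        ++p;
    if (*p == ':')
        *p++ = '\0';        /* terminate this field, step past the colon */
    *cursor = p;            /* at end of input this points at the final NUL */
    return field[0] ? field : 0;
}
```

With a writable buffer holding "msg_size:1:64K" and a cursor pointing at its start, four successive calls would yield "msg_size", "1", "64K" and then NULL, which matches the name, init, last and incr slots parsed above.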
@@ -940,37 +1104,47 @@
static long
arg_size(char ***argvp)
{
- char *p;
- long double d;
- long l = 0;
+ long l;
char **argv = *argvp;
+ *argvp += 2;
if (!argv[1])
error(0, "missing argument to %s", argv[0]);
- d = strtold(argv[1], &p);
- if (d < 0)
+ l = str_size(argv[1], argv[0]);
+ if (l < 0)
error(0, "%s requires a non-negative number", argv[0]);
+ return l;
+}
+
+/*
+ * Scan a size argument from a string.
+ */
+static long
+str_size(char *str, char *arg)
+{
+ char *p;
+ long m = 1;
+ long double d = strtold(str, &p);
+
if (p[0] == '\0')
- l = d;
- else {
- if (streq(p, "kb") || streq(p, "k"))
- l = (long)(d * (1000));
- else if (streq(p, "mb") || streq(p, "m"))
- l = (long)(d * (1000 * 1000));
- else if (streq(p, "gb") || streq(p, "g"))
- l = (long)(d * (1000 * 1000 * 1000));
- else if (streq(p, "kib") || streq(p, "K"))
- l = (long)(d * (1024));
- else if (streq(p, "mib") || streq(p, "M"))
- l = (long)(d * (1024 * 1024));
- else if (streq(p, "gib") || streq(p, "G"))
- l = (long)(d * (1024 * 1024 * 1024));
- else
- error(0, "bad argument: %s", argv[1]);
- }
- *argvp += 2;
- return l;
+ m = 1;
+ else if (streq(p, "kb") || streq(p, "k"))
+ m = 1000;
+ else if (streq(p, "mb") || streq(p, "m"))
+ m = 1000 * 1000;
+ else if (streq(p, "gb") || streq(p, "g"))
+ m = 1000 * 1000 * 1000;
+ else if (streq(p, "kib") || streq(p, "K"))
+ m = 1024;
+ else if (streq(p, "mib") || streq(p, "M"))
+ m = 1024 * 1024;
+ else if (streq(p, "gib") || streq(p, "G"))
+ m = 1024 * 1024 * 1024;
+ else
+ error(0, "%s: bad size: %s", arg, str);
+
+ return d * m;
}
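The suffix handling in str_size above (lower-case k/m/g and kb/mb/gb as decimal multipliers, upper-case K/M/G and kib/mib/gib as binary ones) can be sketched as a table lookup. This is an illustration only: parse_size and its return-code convention are assumptions made for the sketch, whereas the patch reports errors through qperf's error(). Unlike the patch, the sketch also rejects input that has no leading number.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not part of qperf): parse strings such as "4096",
 * "64K" or "1.5mb".  Returns 0 and stores the size in *out on success,
 * -1 if there is no number or the suffix is unknown.
 */
static int
parse_size(const char *str, long *out)
{
    static const struct {
        const char *suffix;
        long        mult;
    } units[] = {
        { "",    1L },
        { "kb",  1000L },               { "k", 1000L },
        { "mb",  1000L * 1000 },        { "m", 1000L * 1000 },
        { "gb",  1000L * 1000 * 1000 }, { "g", 1000L * 1000 * 1000 },
        { "kib", 1024L },               { "K", 1024L },
        { "mib", 1024L * 1024 },        { "M", 1024L * 1024 },
        { "gib", 1024L * 1024 * 1024 }, { "G", 1024L * 1024 * 1024 },
    };
    char *end;
    long double d;
    size_t i;

    assert(str && out);
    d = strtold(str, &end);
    if (end == str)
        return -1;                      /* no digits were consumed */
    for (i = 0; i < sizeof(units) / sizeof(units[0]); ++i) {
        if (strcmp(end, units[i].suffix) == 0) {
            *out = (long)(d * units[i].mult);
            return 0;
        }
    }
    return -1;                          /* unknown suffix */
}
```

Keeping the multipliers in one table keeps the decimal and binary spellings side by side, which is the same separation of "find the multiplier" from "scale the value" that the patch introduces with the m variable.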
@@ -981,6 +1155,7 @@
arg_strn(char ***argvp)
{
char **argv = *argvp;
+
if (!argv[1])
error(0, "missing argument to %s", argv[0]);
*argvp += 2;
@@ -996,11 +1171,12 @@
{
char *p;
long double d;
-
long l = 0;
char **argv = *argvp;
+
if (!argv[1])
error(0, "missing argument to %s", argv[0]);
+
d = strtold(argv[1], &p);
if (d < 0)
error(0, "%s requires a non-negative number", argv[0]);
@@ -1022,6 +1198,7 @@
else
error(0, "bad argument: %s", argv[1]);
}
+
*argvp += 2;
return l;
}
@@ -1045,6 +1222,7 @@
setp_u32(char *name, PAR_INDEX index, uint32_t l)
{
PAR_INFO *p = par_set(name, index);
+
if (!p)
return;
*((uint32_t *)p->ptr) = l;
@@ -1058,6 +1236,7 @@
setp_str(char *name, PAR_INDEX index, char *s)
{
PAR_INFO *p = par_set(name, index);
+
if (!p)
return;
if (strlen(s) >= STRSIZE)
@@ -1073,6 +1252,7 @@
par_use(PAR_INDEX index)
{
PAR_INFO *p = par_info(index);
+
p->used = 1;
p->inuse = 1;
}
@@ -1085,6 +1265,7 @@
par_set(char *name, PAR_INDEX index)
{
PAR_INFO *p = par_info(index);
+
if (index == P_NULL)
return 0;
if (name) {
@@ -1125,6 +1306,29 @@
/*
+ * If any options were set but were not used, print out a warning message for
+ * the user.
+ */
+void
+opt_check(void)
+{
+ PAR_INFO *p;
+ PAR_INFO *q;
+ PAR_INFO *r = endof(ParInfo);
+
+ for (p = ParInfo; p < r; ++p) {
+ if (p->used || !p->set)
+ continue;
+ error(RET, "warning: %s set but not used in test %s",
+ p->name, TestName);
+ for (q = p+1; q < r; ++q)
+ if (q->set && q->name == p->name)
+ q->set = 0;
+ }
+}
+
+
+/*
* Server.
*/
static void
@@ -1137,7 +1341,7 @@
TEST *test;
int s = offset(REQ, req_index);
- debug("waiting for request");
+ debug("ready for requests");
if (!server_recv_request())
continue;
pid = fork();
@@ -1164,12 +1368,10 @@
test = &Tests[Req.req_index];
TestName = test->name;
- debug("request is %s", TestName);
+ debug("received request: %s", TestName);
init_lstat();
- Finished = 0;
set_affinity();
(test->server)();
- stop_test_timer();
exit(0);
}
close(ListenFD);
@@ -1217,8 +1419,8 @@
.ai_family = AF_UNSPEC,
.ai_socktype = SOCK_STREAM
};
+ AI *ailist = getaddrinfo_port(0, ListenPort, &hints);
- AI *ailist = getaddrinfo_port(0, ListenPort, &hints);
for (ai = ailist; ai; ai = ai->ai_next) {
ListenFD = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
if (ListenFD < 0)
@@ -1283,10 +1485,9 @@
RReq.ver_inc = VER_INC;
RReq.req_index = test - Tests;
TestName = test->name;
- debug("sending request %s", TestName);
+ debug("sending request: %s", TestName);
init_lstat();
printf("%s:\n", TestName);
- Finished = 0;
(*test->client)();
remotefd_close();
place_show();
@@ -1305,32 +1506,37 @@
.ai_family = AF_UNSPEC,
.ai_socktype = SOCK_STREAM
};
+ AI *ailist = getaddrinfo_port(ServerName, ListenPort, &hints);
- AI *ailist = getaddrinfo_port(ServerName, ListenPort, &hints);
RemoteFD = -1;
- if (Wait)
- start_test_timer(Wait);
+ if (ServerWait)
+ start_test_timer(ServerWait);
for (;;) {
for (a = ailist; a; a = a->ai_next) {
+ if (Finished)
+ break;
RemoteFD = socket(a->ai_family, a->ai_socktype, a->ai_protocol);
- if (RemoteFD >= 0) {
- if (connect(RemoteFD, a->ai_addr, a->ai_addrlen) == SUCCESS0) {
- ServerAddrLen = a->ai_addrlen;
- memcpy(&ServerAddr, a->ai_addr, ServerAddrLen);
- break;
- }
+ if (RemoteFD < 0)
+ continue;
+ if (connect(RemoteFD, a->ai_addr, a->ai_addrlen) != SUCCESS0) {
remotefd_close();
+ continue;
}
+ ServerAddrLen = a->ai_addrlen;
+ memcpy(&ServerAddr, a->ai_addr, ServerAddrLen);
+ break;
}
- if (RemoteFD >= 0 || !Wait || Finished)
+ if (RemoteFD >= 0 || !ServerWait || Finished)
break;
sleep(1);
}
- if (Wait)
+
+ if (ServerWait)
stop_test_timer();
freeaddrinfo(ailist);
+
if (RemoteFD < 0)
- error(0, "failed to connect");
+ error(0, "%s: failed to connect", ServerName);
remotefd_setup();
enc_init(&req);
enc_req(&RReq);
@@ -1345,6 +1551,7 @@
remotefd_setup(void)
{
int one = 1;
+
if (ioctl(RemoteFD, FIONBIO, &one) < 0)
error(SYS, "ioctl FIONBIO failed");
if (fcntl(RemoteFD, F_SETOWN, getpid()) < 0)
@@ -1378,12 +1585,12 @@
recv_mesg(&stat, sizeof(stat), "results");
dec_init(&stat);
dec_stat(&RStat);
- send_sync("results");
+ send_sync("synchronization after test");
} else {
enc_init(&stat);
enc_stat(&LStat);
send_mesg(&stat, sizeof(stat), "results");
- recv_sync("results");
+ recv_sync("synchronization after test");
}
}
@@ -1428,6 +1635,7 @@
run_server_conf(void)
{
CONF conf;
+
get_conf(&conf);
send_mesg(&conf, sizeof(conf), "configuration");
}
@@ -1462,10 +1670,10 @@
char buf[BUFSIZE];
char cpu[BUFSIZE];
char mhz[BUFSIZE];
-
int cpus = 0;
int mixed = 0;
FILE *fp = fopen("/proc/cpuinfo", "r");
+
if (!fp)
error(0, "cannot open /proc/cpuinfo");
cpu[0] = '\0';
@@ -1583,7 +1791,7 @@
void
sync_test(void)
{
- synchronize("test");
+ synchronize("synchronization before test");
start_test_timer(Req.time);
}
@@ -1596,12 +1804,13 @@
{
struct itimerval itimerval = {{0}};
+ Finished = 0;
get_times(LStat.time_s);
setitimer(ITIMER_REAL, &itimerval, 0);
if (!seconds)
return;
- debug("starting timer");
+ debug("starting timer for %d seconds", seconds);
itimerval.it_value.tv_sec = seconds;
itimerval.it_interval.tv_usec = 1;
setitimer(ITIMER_REAL, &itimerval, 0);
@@ -1610,13 +1819,14 @@
/*
* Stop timing. Note that the end time is obtained by the first call to
- * set_finished. In the tests, usually, when SIGALRM goes off, it is executing
- * a read or write system call which gets interrupted. If SIGALRM goes off
- * after Finished is checked but before the system call is performed, the
- * system call will be executed and it will take the second SIGALRM call
- * generated by the interval timer to wake it up. Hence, we save the end times
- * in sig_alrm. Note that if Finished is set, we reject any packets that are
- * sent or arrive in order not to cheat.
+ * set_finished. In the tests, when SIGALRM goes off, it may be executing a
+ * system call which gets interrupted. If SIGALRM goes off after Finished is
+ * checked but before the system call is initiated, the system call will be
+ * executed and it will take the second SIGALRM call generated by the interval
+ * timer to wake it up. Hence, we save the end times in sig_alrm. Note that
+ * if Finished is set, we reject any packets that are sent or arrive in order
+ * not to cheat. We clear Finished since code assumes that it is the default
+ * state.
*/
void
stop_test_timer(void)
@@ -1625,6 +1835,7 @@
set_finished();
setitimer(ITIMER_REAL, &itimerval, 0);
+ Finished = 0;
debug("stopping timer");
}
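The comment on stop_test_timer above describes why the timer is armed with a repeating interval: if SIGALRM arrives after Finished is checked but before a blocking call starts, only a later SIGALRM can interrupt that call. A minimal sketch of the pattern follows. The names finished, on_alarm, start_timer and stop_timer are hypothetical, the 1 ms re-fire interval is an arbitrary choice for the sketch, and the code is not taken from qperf.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/time.h>
#include <unistd.h>

static volatile sig_atomic_t finished;

/* Signal handler: only record that the deadline has passed. */
static void
on_alarm(int sig)
{
    (void)sig;
    finished = 1;
}

/* Arm a timer that fires after usec microseconds and keeps re-firing. */
static void
start_timer(long usec)
{
    struct sigaction sa;
    struct itimerval it;

    assert(usec > 0);
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_alarm;       /* no SA_RESTART: blocked calls get EINTR */
    sigemptyset(&sa.sa_mask);
    sigaction(SIGALRM, &sa, 0);

    finished = 0;
    memset(&it, 0, sizeof(it));
    it.it_value.tv_sec = usec / 1000000;
    it.it_value.tv_usec = usec % 1000000;
    it.it_interval.tv_usec = 1000;  /* repeat so that a late sleeper is woken */
    setitimer(ITIMER_REAL, &it, 0);
}

/* Disarm the timer and restore the default (not finished) state. */
static void
stop_timer(void)
{
    struct itimerval it;

    memset(&it, 0, sizeof(it));
    setitimer(ITIMER_REAL, &it, 0);
    finished = 0;
}
```

A caller would typically run start_timer(10000), loop with while (!finished) pause();, and then call stop_timer(). If the first signal lands between the check and pause(), the repeated signal still wakes the process, which is the race the comment describes.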
@@ -2328,6 +2539,7 @@
place_any(char *pref, char *name, char *unit, char *data, char *altn)
{
SHOW *show = &ShowTable[ShowIndex++];
+
if (ShowIndex > cardof(ShowTable))
error(BUG, "need to increase size of ShowTable");
show->pref = pref;
@@ -2422,6 +2634,7 @@
enc_int(host->req_index, sizeof(host->req_index));
enc_int(host->access_recv, sizeof(host->access_recv));
enc_int(host->affinity, sizeof(host->affinity));
+ enc_int(host->alt_port, sizeof(host->alt_port));
enc_int(host->flip, sizeof(host->flip));
enc_int(host->msg_size, sizeof(host->msg_size));
enc_int(host->mtu_size, sizeof(host->mtu_size));
@@ -2431,6 +2644,7 @@
enc_int(host->rd_atomic, sizeof(host->rd_atomic));
enc_int(host->sl, sizeof(host->sl));
enc_int(host->sock_buf_size, sizeof(host->sock_buf_size));
+ enc_int(host->src_path_bits, sizeof(host->src_path_bits));
enc_int(host->time, sizeof(host->time));
enc_int(host->timeout, sizeof(host->timeout));
enc_int(host->use_cm, sizeof(host->use_cm));
@@ -2462,6 +2676,7 @@
host->req_index = dec_int(sizeof(host->req_index));
host->access_recv = dec_int(sizeof(host->access_recv));
host->affinity = dec_int(sizeof(host->affinity));
+ host->alt_port = dec_int(sizeof(host->alt_port));
host->flip = dec_int(sizeof(host->flip));
host->msg_size = dec_int(sizeof(host->msg_size));
host->mtu_size = dec_int(sizeof(host->mtu_size));
@@ -2471,6 +2686,7 @@
host->rd_atomic = dec_int(sizeof(host->rd_atomic));
host->sl = dec_int(sizeof(host->sl));
host->sock_buf_size = dec_int(sizeof(host->sock_buf_size));
+ host->src_path_bits = dec_int(sizeof(host->src_path_bits));
host->time = dec_int(sizeof(host->time));
host->timeout = dec_int(sizeof(host->timeout));
host->use_cm = dec_int(sizeof(host->use_cm));
Modified: branches/ofed-1.4.1upgrade/qperf/trunk/src/qperf.h
===================================================================
--- branches/ofed-1.4.1upgrade/qperf/trunk/src/qperf.h 2009-05-29 14:02:35 UTC (rev 293)
+++ branches/ofed-1.4.1upgrade/qperf/trunk/src/qperf.h 2009-05-29 14:04:56 UTC (rev 294)
@@ -1,8 +1,8 @@
/*
* qperf - general header file.
*
- * Copyright (c) 2002-2008 Johann George. All rights reserved.
- * Copyright (c) 2006-2008 QLogic Corporation. All rights reserved.
+ * Copyright (c) 2002-2009 Johann George. All rights reserved.
+ * Copyright (c) 2006-2009 QLogic Corporation. All rights reserved.
*
* This software is available to you under a choice of one of two
* licenses. You may choose to be licensed under the terms of the GNU
@@ -49,7 +49,7 @@
#define cardof(a) (sizeof(a)/sizeof(*a))
#define endof(a) (&a[cardof(a)])
#define streq(a, b) (strcmp(a, b) == 0)
-#define offset(t, e) ((int)&((t *)0)->e)
+#define offset(t, e) ((long)&((t *)0)->e)
#define is_client() (ServerName != 0)
#define is_sender() (Req.flip ? !is_client() : is_client())
@@ -97,6 +97,8 @@
R_ACCESS_RECV,
L_AFFINITY,
R_AFFINITY,
+ L_ALT_PORT,
+ R_ALT_PORT,
L_FLIP,
R_FLIP,
L_ID,
@@ -117,6 +119,8 @@
R_SL,
L_SOCK_BUF_SIZE,
R_SOCK_BUF_SIZE,
+ L_SRC_PATH_BITS,
+ R_SRC_PATH_BITS,
L_STATIC_RATE,
R_STATIC_RATE,
L_TIME,
@@ -153,6 +157,7 @@
uint16_t req_index; /* Request index (into Tests) */
uint32_t access_recv; /* Access data after receiving */
uint32_t affinity; /* Processor affinity */
+ uint32_t alt_port; /* Alternate path port number */
uint32_t flip; /* Flip sender/receiver */
uint32_t msg_size; /* Message Size */
uint32_t mtu_size; /* MTU Size */
@@ -162,6 +167,7 @@
uint32_t rd_atomic; /* Number of pending RDMA or atomics */
uint32_t sl; /* Service level */
uint32_t sock_buf_size; /* Socket buffer size */
+ uint32_t src_path_bits; /* Source path bits */
uint32_t time; /* Duration in seconds */
uint32_t timeout; /* Timeout for messages */
uint32_t use_cm; /* Use Connection Manager */
@@ -268,7 +274,7 @@
void setsockopt_one(int fd, int optname);
void synchronize(char *msg);
void touch_data(void *p, int n);
-void urgent_error(void);
+void urgent(void);
/*
@@ -343,6 +349,12 @@
void run_server_ver_rc_compare_swap(void);
void run_client_ver_rc_fetch_add(void);
void run_server_ver_rc_fetch_add(void);
+void run_client_xrc_bi_bw(void);
+void run_server_xrc_bi_bw(void);
+void run_client_xrc_bw(void);
+void run_server_xrc_bw(void);
+void run_client_xrc_lat(void);
+void run_server_xrc_lat(void);
/*
Modified: branches/ofed-1.4.1upgrade/qperf/trunk/src/rdma.c
===================================================================
--- branches/ofed-1.4.1upgrade/qperf/trunk/src/rdma.c 2009-05-29 14:02:35 UTC (rev 293)
+++ branches/ofed-1.4.1upgrade/qperf/trunk/src/rdma.c 2009-05-29 14:04:56 UTC (rev 294)
@@ -1,8 +1,8 @@
/*
* qperf - handle RDMA tests.
*
- * Copyright (c) 2002-2008 Johann George. All rights reserved.
- * Copyright (c) 2006-2008 QLogic Corporation. All rights reserved.
+ * Copyright (c) 2002-2009 Johann George. All rights reserved.
+ * Copyright (c) 2006-2009 QLogic Corporation. All rights reserved.
*
* This software is available to you under a choice of one of two
* licenses. You may choose to be licensed under the terms of the GNU
@@ -33,6 +33,7 @@
* SOFTWARE.
*/
#define _GNU_SOURCE
+#include <fcntl.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
@@ -75,20 +76,18 @@
/*
* For convenience.
*/
-typedef enum ibv_wr_opcode OPCODE;
+typedef enum ibv_wr_opcode ibv_op;
+typedef struct ibv_comp_channel ibv_cc;
+typedef struct ibv_xrc_domain ibv_xrc;
/*
- * Some of the tests.
+ * Atomic operations.
*/
-typedef enum ETEST {
- RC_COMPARE_SWAP_MR,
- RC_FETCH_ADD_MR,
- RC_RDMA_READ_BW,
- RC_RDMA_READ_LAT,
- VER_RC_COMPARE_SWAP,
- VER_RC_FETCH_ADD,
-} ETEST;
+typedef enum ATOMIC {
+ COMPARE_SWAP,
+ FETCH_ADD
+} ATOMIC;
/*
@@ -104,11 +103,14 @@
* Information specific to a node.
*/
typedef struct NODE {
+ uint64_t vaddr; /* Virtual address */
uint32_t lid; /* Local ID */
uint32_t qpn; /* Queue pair number */
uint32_t psn; /* Packet sequence number */
+ uint32_t srqn; /* Shared receive queue number */
uint32_t rkey; /* Remote key */
- uint64_t vaddr; /* Virtual address */
+ uint32_t alt_lid; /* Alternate path local ID */
+ uint32_t rd_atomic; /* Number of read/atomics supported */
} NODE;
@@ -138,21 +140,26 @@
* RDMA device descriptor.
*/
typedef struct DEVICE {
- NODE lnode; /* Local node information */
- NODE rnode; /* Remote node information */
- IBINFO ib; /* InfiniBand information */
- CMINFO cm; /* Connection Manager information */
- int trans; /* QP transport */
- int maxSendWR; /* Maximum send work requests */
- int maxRecvWR; /* Maximum receive work requests */
- int maxInline; /* Maximum amount of inline data */
- char *buffer; /* Buffer */
- struct ibv_comp_channel *channel; /* Channel */
- struct ibv_pd *pd; /* Protection domain */
- struct ibv_mr *mr; /* Memory region */
- struct ibv_cq *cq; /* Completion queue */
- struct ibv_qp *qp; /* QPair */
- struct ibv_ah *ah; /* Address handle */
+ NODE lnode; /* Local node information */
+ NODE rnode; /* Remote node information */
+ IBINFO ib; /* InfiniBand information */
+ CMINFO cm; /* Connection Manager information */
+ uint32_t qkey; /* Q Key for UD */
+ int trans; /* QP transport */
+ int msg_size; /* Message size */
+ int buf_size; /* Buffer size */
+ int max_send_wr; /* Maximum send work requests */
+ int max_recv_wr; /* Maximum receive work requests */
+ int max_inline; /* Maximum amount of inline data */
+ char *buffer; /* Buffer */
+ ibv_cc *channel; /* Channel */
+ struct ibv_pd *pd; /* Protection domain */
+ struct ibv_mr *mr; /* Memory region */
+ struct ibv_cq *cq; /* Completion queue */
+ struct ibv_qp *qp; /* Queue Pair */
+ struct ibv_ah *ah; /* Address handle */
+ struct ibv_srq *srq; /* Shared receive queue */
+ ibv_xrc *xrc; /* XRC domain */
} DEVICE;
@@ -177,6 +184,8 @@
/*
* Function prototypes.
*/
+static void atomic_seq(ATOMIC atomic, int i,
+ uint64_t *value, uint64_t *args);
static void cm_ack_event(DEVICE *dev);
static void cm_close(DEVICE *dev);
static char *cm_event_name(int event, char *data, int size);
@@ -190,31 +199,36 @@
static void dec_node(NODE *host);
static void do_error(int status, uint64_t *errors);
static void enc_node(NODE *host);
-static void ib_client_atomic(ETEST etest);
-static void ib_close(DEVICE *dev);
+static void ib_client_atomic(ATOMIC atomic);
+static void ib_client_verify_atomic(ATOMIC atomic);
+static void ib_close1(DEVICE *dev);
+static void ib_close2(DEVICE *dev);
+static void ib_migrate(DEVICE *dev);
static void ib_open(DEVICE *dev);
-static int ib_poll(DEVICE *dev, struct ibv_wc *wc, int nwc);
-static void ib_post_rdma(DEVICE *dev, OPCODE opcode, int n);
-static void ib_post_compare_swap(DEVICE *dev,
- int wrid, int offset, uint64_t compare, uint64_t swap);
-static void ib_post_fetch_add(DEVICE *dev,
- int wrid, int offset, uint64_t add);
-static void ib_post_recv(DEVICE *dev, int n);
-static void ib_post_send(DEVICE *dev, int n);
+static void ib_post_atomic(DEVICE *dev, ATOMIC atomic, int wrid,
+ int offset, uint64_t compare_add, uint64_t swap);
static void ib_prep(DEVICE *dev);
static void rd_bi_bw(int transport);
static void rd_client_bw(int transport);
-static void rd_client_rdma_bw(int transport, OPCODE opcode);
+static void rd_client_rdma_bw(int transport, ibv_op opcode);
static void rd_client_rdma_read_lat(int transport);
static void rd_close(DEVICE *dev);
-static void rd_open(DEVICE *dev, int trans, int maxSendWR, int maxRecvWR);
-static void rd_params(int transport, long msgSize, int poll, int atomic);
+static void rd_mralloc(DEVICE *dev, int size);
+static void rd_mrfree(DEVICE *dev);
+static void rd_open(DEVICE *dev, int trans, int max_send_wr, int max_recv_wr);
+static void rd_params(int transport, long msg_size, int poll, int atomic);
+static int rd_poll(DEVICE *dev, struct ibv_wc *wc, int nwc);
+static void rd_post_rdma_std(DEVICE *dev, ibv_op opcode, int n);
+static void rd_post_recv_std(DEVICE *dev, int n);
+static void rd_post_send(DEVICE *dev, int off, int len,
+ int inc, int rep, int stat);
+static void rd_post_send_std(DEVICE *dev, int n);
static void rd_pp_lat(int transport, IOMODE iomode);
static void rd_pp_lat_loop(DEVICE *dev, IOMODE iomode);
static void rd_prep(DEVICE *dev, int size);
static void rd_rdma_write_poll_lat(int transport);
static void rd_server_def(int transport);
-static void rd_server_nop(int transport, ETEST etest);
+static void rd_server_nop(int transport, int size);
static int maybe(int val, char *msg);
static char *opcode_name(int opcode);
static void show_node_info(DEVICE *dev);
@@ -367,7 +381,7 @@
void
run_client_rc_compare_swap_mr(void)
{
- ib_client_atomic(RC_COMPARE_SWAP_MR);
+ ib_client_atomic(COMPARE_SWAP);
}
@@ -377,7 +391,7 @@
void
run_server_rc_compare_swap_mr(void)
{
- rd_server_nop(IBV_QPT_RC, RC_COMPARE_SWAP_MR);
+ rd_server_nop(IBV_QPT_RC, sizeof(uint64_t));
}
@@ -387,7 +401,7 @@
void
run_client_rc_fetch_add_mr(void)
{
- ib_client_atomic(RC_FETCH_ADD_MR);
+ ib_client_atomic(FETCH_ADD);
}
@@ -397,7 +411,7 @@
void
run_server_rc_fetch_add_mr(void)
{
- rd_server_nop(IBV_QPT_RC, RC_FETCH_ADD_MR);
+ rd_server_nop(IBV_QPT_RC, sizeof(uint64_t));
}
@@ -444,7 +458,7 @@
void
run_server_rc_rdma_read_bw(void)
{
- rd_server_nop(IBV_QPT_RC, RC_RDMA_READ_BW);
+ rd_server_nop(IBV_QPT_RC, 0);
}
@@ -465,7 +479,7 @@
void
run_server_rc_rdma_read_lat(void)
{
- rd_server_nop(IBV_QPT_RC, RC_RDMA_READ_LAT);
+ rd_server_nop(IBV_QPT_RC, 0);
}
@@ -740,77 +754,94 @@
rd_pp_lat(IBV_QPT_UD, IO_SR);
}
+
/*
- * Verify RC compare and swap (client side).
+ * Measure XRC bi-directional bandwidth (client side).
*/
void
-run_client_ver_rc_compare_swap(void)
+run_client_xrc_bi_bw(void)
{
- int i;
- DEVICE dev;
- uint32_t size;
- uint64_t *result;
- uint64_t last = 0;
- uint64_t cur = 0;
- uint64_t next = 0x0123456789abcdefULL;
+ par_use(L_ACCESS_RECV);
+ par_use(R_ACCESS_RECV);
+ rd_params(IBV_QPT_XRC, K64, 1, 0);
+ rd_bi_bw(IBV_QPT_XRC);
+ show_results(BANDWIDTH);
+}
- rd_params(IBV_QPT_RC, 0, 1, 1);
- rd_open(&dev, IBV_QPT_RC, NCQE, 0);
- size = Req.rd_atomic * sizeof(uint64_t);
- encode_uint32(&size, size);
- send_mesg(&size, sizeof(size), "Memory region size");
- rd_prep(&dev, size);
- sync_test();
- for (i = 0; i < Req.rd_atomic; ++i) {
- ib_post_compare_swap(&dev, i, i*sizeof(uint64_t), cur, next);
- cur = next;
- next = cur + 1;
- }
- result = (uint64_t *) dev.buffer;
- while (!Finished) {
- struct ibv_wc wc[NCQE];
- int n = ib_poll(&dev, wc, cardof(wc));
- uint64_t res;
- if (Finished)
- break;
- if (n > LStat.max_cqes)
- LStat.max_cqes = n;
- for (i = 0; i < n; ++i) {
- int x = wc[i].wr_id;
- int status = wc[i].status;
- if (status == IBV_WC_SUCCESS) {
- LStat.rem_r.no_bytes += sizeof(uint64_t);
- LStat.rem_r.no_msgs++;
- } else
- do_error(status, &LStat.s.no_errs);
- res = result[x];
- if (last != res)
- error(0, "compare and swap mismatch (expected %llx vs. %llx)",
- (long long)last, (long long)res);
- if (last)
- last++;
- else
- last = 0x0123456789abcdefULL;
- next = cur + 1;
- ib_post_compare_swap(&dev, x, x*sizeof(uint64_t), cur, next);
- cur = next;
- }
- }
- stop_test_timer();
- exchange_results();
- rd_close(&dev);
- show_results(MSG_RATE);
+/*
+ * Measure XRC bi-directional bandwidth (server side).
+ */
+void
+run_server_xrc_bi_bw(void)
+{
+ rd_bi_bw(IBV_QPT_XRC);
}
/*
+ * Measure XRC bandwidth (client side).
+ */
+void
+run_client_xrc_bw(void)
+{
+ par_use(L_ACCESS_RECV);
+ par_use(R_ACCESS_RECV);
+ par_use(L_NO_MSGS);
+ par_use(R_NO_MSGS);
+ rd_params(IBV_QPT_XRC, K64, 1, 0);
+ rd_client_bw(IBV_QPT_XRC);
+ show_results(BANDWIDTH);
+}
+
+
+/*
+ * Measure XRC bandwidth (server side).
+ */
+void
+run_server_xrc_bw(void)
+{
+ rd_server_def(IBV_QPT_XRC);
+}
+
+
+/*
+ * Measure XRC latency (client side).
+ */
+void
+run_client_xrc_lat(void)
+{
+ rd_params(IBV_QPT_XRC, 1, 1, 0);
+ rd_pp_lat(IBV_QPT_XRC, IO_SR);
+}
+
+
+/*
+ * Measure XRC latency (server side).
+ */
+void
+run_server_xrc_lat(void)
+{
+ rd_pp_lat(IBV_QPT_XRC, IO_SR);
+}
+
+/*
+ * Verify RC compare and swap (client side).
+ */
+void
+run_client_ver_rc_compare_swap(void)
+{
+ ib_client_verify_atomic(COMPARE_SWAP);
+}
+
+
+/*
* Verify RC compare and swap (server side).
*/
void
run_server_ver_rc_compare_swap(void)
{
- rd_server_nop(IBV_QPT_RC, VER_RC_COMPARE_SWAP);
+ rd_server_nop(IBV_QPT_RC, sizeof(uint64_t));
}
@@ -820,51 +851,7 @@
void
run_client_ver_rc_fetch_add(void)
{
- int i;
- DEVICE dev;
- uint32_t size;
- uint64_t *result;
- uint64_t last = 0;
-
- rd_params(IBV_QPT_RC, 0, 1, 1);
- rd_open(&dev, IBV_QPT_RC, NCQE, 0);
- size = Req.rd_atomic * sizeof(uint64_t);
- encode_uint32(&size, size);
- send_mesg(&size, sizeof(size), "Memory region size");
- rd_prep(&dev, size);
- sync_test();
- for (i = 0; i < Req.rd_atomic; ++i)
- ib_post_fetch_add(&dev, i, i*sizeof(uint64_t), 1);
- result = (uint64_t *) dev.buffer;
- while (!Finished) {
- struct ibv_wc wc[NCQE];
- int n = ib_poll(&dev, wc, cardof(wc));
- uint64_t res;
-
- if (Finished)
- break;
- if (n > LStat.max_cqes)
- LStat.max_cqes = n;
- for (i = 0; i < n; ++i) {
- int x = wc[i].wr_id;
- int status = wc[i].status;
- if (status == IBV_WC_SUCCESS) {
- LStat.rem_r.no_bytes += sizeof(uint64_t);
- LStat.rem_r.no_msgs++;
- } else
- do_error(status, &LStat.s.no_errs);
- res = result[x];
- if (last != res)
- error(0, "fetch and add mismatch (expected %llx vs. %llx)",
- (long long)last, (long long)res);
- last++;
- ib_post_fetch_add(&dev, x, x*sizeof(uint64_t), 1);
- }
- }
- stop_test_timer();
- exchange_results();
- rd_close(&dev);
- show_results(MSG_RATE);
+ ib_client_verify_atomic(FETCH_ADD);
}
@@ -874,7 +861,7 @@
void
run_server_ver_rc_fetch_add(void)
{
- rd_server_nop(IBV_QPT_RC, VER_RC_FETCH_ADD);
+ rd_server_nop(IBV_QPT_RC, sizeof(uint64_t));
}
@@ -885,18 +872,18 @@
rd_client_bw(int transport)
{
DEVICE dev;
+ long sent = 0;
- long sent = 0;
rd_open(&dev, transport, NCQE, 0);
rd_prep(&dev, 0);
sync_test();
- ib_post_send(&dev, left_to_send(&sent, NCQE));
+ rd_post_send_std(&dev, left_to_send(&sent, NCQE));
sent = NCQE;
while (!Finished) {
int i;
struct ibv_wc wc[NCQE];
+ int n = rd_poll(&dev, wc, cardof(wc));
- int n = ib_poll(&dev, wc, cardof(wc));
if (n > LStat.max_cqes)
LStat.max_cqes = n;
if (Finished)
@@ -904,6 +891,7 @@
for (i = 0; i < n; ++i) {
int id = wc[i].wr_id;
int status = wc[i].status;
+
if (id != WRID_SEND)
debug("bad WR ID %d", id);
else if (status != IBV_WC_SUCCESS)
@@ -914,7 +902,7 @@
break;
n = left_to_send(&sent, n);
}
- ib_post_send(&dev, n);
+ rd_post_send_std(&dev, n);
sent += n;
}
stop_test_timer();
@@ -934,30 +922,32 @@
rd_open(&dev, transport, 0, NCQE);
rd_prep(&dev, 0);
- ib_post_recv(&dev, NCQE);
+ rd_post_recv_std(&dev, NCQE);
sync_test();
while (!Finished) {
int i;
struct ibv_wc wc[NCQE];
- int n = ib_poll(&dev, wc, cardof(wc));
+ int n = rd_poll(&dev, wc, cardof(wc));
+
if (Finished)
break;
if (n > LStat.max_cqes)
LStat.max_cqes = n;
for (i = 0; i < n; ++i) {
int status = wc[i].status;
+
if (status == IBV_WC_SUCCESS) {
- LStat.r.no_bytes += Req.msg_size;
+ LStat.r.no_bytes += dev.msg_size;
LStat.r.no_msgs++;
if (Req.access_recv)
- touch_data(dev.buffer, Req.msg_size);
+ touch_data(dev.buffer, dev.msg_size);
} else
do_error(status, &LStat.r.no_errs);
}
if (Req.no_msgs)
if (LStat.r.no_msgs + LStat.r.no_errs >= Req.no_msgs)
break;
- ib_post_recv(&dev, n);
+ rd_post_recv_std(&dev, n);
}
stop_test_timer();
exchange_results();
@@ -975,15 +965,16 @@
rd_open(&dev, transport, NCQE, NCQE);
rd_prep(&dev, 0);
- ib_post_recv(&dev, NCQE);
+ rd_post_recv_std(&dev, NCQE);
sync_test();
- ib_post_send(&dev, NCQE);
+ rd_post_send_std(&dev, NCQE);
while (!Finished) {
int i;
struct ibv_wc wc[NCQE];
int numSent = 0;
int numRecv = 0;
- int n = ib_poll(&dev, wc, cardof(wc));
+ int n = rd_poll(&dev, wc, cardof(wc));
+
if (Finished)
break;
if (n > LStat.max_cqes)
@@ -991,6 +982,7 @@
for (i = 0; i < n; ++i) {
int id = wc[i].wr_id;
int status = wc[i].status;
+
switch (id) {
case WRID_SEND:
if (status != IBV_WC_SUCCESS)
@@ -999,10 +991,10 @@
break;
case WRID_RECV:
if (status == IBV_WC_SUCCESS) {
- LStat.r.no_bytes += Req.msg_size;
+ LStat.r.no_bytes += dev.msg_size;
LStat.r.no_msgs++;
if (Req.access_recv)
- touch_data(dev.buffer, Req.msg_size);
+ touch_data(dev.buffer, dev.msg_size);
} else
do_error(status, &LStat.r.no_errs);
++numRecv;
@@ -1012,9 +1004,9 @@
}
}
if (numRecv)
- ib_post_recv(&dev, numRecv);
+ rd_post_recv_std(&dev, numRecv);
if (numSent)
- ib_post_send(&dev, numSent);
+ rd_post_send_std(&dev, numSent);
}
stop_test_timer();
exchange_results();
@@ -1048,25 +1040,28 @@
rd_pp_lat_loop(DEVICE *dev, IOMODE iomode)
{
int done = 1;
- ib_post_recv(dev, 1);
+
+ rd_post_recv_std(dev, 1);
sync_test();
if (is_client()) {
if (iomode == IO_SR)
- ib_post_send(dev, 1);
+ rd_post_send_std(dev, 1);
else
- ib_post_rdma(dev, IBV_WR_RDMA_WRITE_WITH_IMM, 1);
+ rd_post_rdma_std(dev, IBV_WR_RDMA_WRITE_WITH_IMM, 1);
done = 0;
}
while (!Finished) {
int i;
struct ibv_wc wc[2];
- int n = ib_poll(dev, wc, cardof(wc));
+ int n = rd_poll(dev, wc, cardof(wc));
+
if (Finished)
break;
for (i = 0; i < n; ++i) {
int id = wc[i].wr_id;
int status = wc[i].status;
+
switch (id) {
case WRID_SEND:
case WRID_RDMA:
@@ -1076,9 +1071,9 @@
continue;
case WRID_RECV:
if (status == IBV_WC_SUCCESS) {
- LStat.r.no_bytes += Req.msg_size;
+ LStat.r.no_bytes += dev->msg_size;
LStat.r.no_msgs++;
- ib_post_recv(dev, 1);
+ rd_post_recv_std(dev, 1);
} else
do_error(status, &LStat.r.no_errs);
done |= 2;
@@ -1091,9 +1086,9 @@
}
if (done == 3) {
if (iomode == IO_SR)
- ib_post_send(dev, 1);
+ rd_post_send_std(dev, 1);
else
- ib_post_rdma(dev, IBV_WR_RDMA_WRITE_WITH_IMM, 1);
+ rd_post_rdma_std(dev, IBV_WR_RDMA_WRITE_WITH_IMM, 1);
done = 0;
}
}
@@ -1102,33 +1097,38 @@
/*
* Loop sending packets back and forth using RDMA Write and polling to measure
- * latency. Note that if we increase the number of entries of wc to be NCQE,
- * on the PS HCA, the latency is much longer.
+ * latency. This is the strategy used by some of the MPIs. Note that it does
+ * not matter what characters clientid and serverid are set to as long as they
+ * are different. Note also that we must set *p and *q before calling
+ * sync_test to avoid a race condition.
*/
static void
rd_rdma_write_poll_lat(int transport)
{
DEVICE dev;
- volatile char *p;
- volatile char *q;
- int send = is_client() ? 1 : 0;
- int locID = send;
- int remID = !locID;
+ volatile unsigned char *p, *q;
+ int send, locid, remid;
+ int clientid = 0x55;
+ int serverid = 0xaa;
+ if (is_client())
+ send = 1, locid = clientid, remid = serverid;
+ else
+ send = 0, locid = serverid, remid = clientid;
rd_open(&dev, transport, NCQE, 0);
rd_prep(&dev, 0);
+ p = (unsigned char *)dev.buffer;
+ q = p + dev.msg_size-1;
+ *p = locid;
+ *q = locid;
sync_test();
- p = &dev.buffer[0];
- q = &dev.buffer[Req.msg_size-1];
while (!Finished) {
- *p = locID;
- *q = locID;
if (send) {
int i;
int n;
struct ibv_wc wc[2];
- ib_post_rdma(&dev, IBV_WR_RDMA_WRITE, 1);
+ rd_post_rdma_std(&dev, IBV_WR_RDMA_WRITE, 1);
if (Finished)
break;
n = ibv_poll_cq(dev.cq, cardof(wc), wc);
@@ -1137,6 +1137,7 @@
for (i = 0; i < n; ++i) {
int id = wc[i].wr_id;
int status = wc[i].status;
+
if (id != WRID_RDMA)
debug("bad WR ID %d", id);
else if (status != IBV_WC_SUCCESS)
@@ -1144,10 +1145,12 @@
}
}
while (!Finished)
- if (*p == remID && *q == remID)
+ if (*p == remid && *q == remid)
break;
- LStat.r.no_bytes += Req.msg_size;
+ LStat.r.no_bytes += dev.msg_size;
LStat.r.no_msgs++;
+ *p = locid;
+ *q = locid;
send = 1;
}
stop_test_timer();
@@ -1167,10 +1170,11 @@
rd_open(&dev, transport, 1, 0);
rd_prep(&dev, 0);
sync_test();
- ib_post_rdma(&dev, IBV_WR_RDMA_READ, 1);
+ rd_post_rdma_std(&dev, IBV_WR_RDMA_READ, 1);
while (!Finished) {
struct ibv_wc wc;
- int n = ib_poll(&dev, &wc, 1);
+ int n = rd_poll(&dev, &wc, 1);
+
if (n == 0)
continue;
if (Finished)
@@ -1180,13 +1184,13 @@
continue;
}
if (wc.status == IBV_WC_SUCCESS) {
- LStat.r.no_bytes += Req.msg_size;
+ LStat.r.no_bytes += dev.msg_size;
LStat.r.no_msgs++;
- LStat.rem_s.no_bytes += Req.msg_size;
+ LStat.rem_s.no_bytes += dev.msg_size;
LStat.rem_s.no_msgs++;
} else
do_error(wc.status, &LStat.s.no_errs);
- ib_post_rdma(&dev, IBV_WR_RDMA_READ, 1);
+ rd_post_rdma_std(&dev, IBV_WR_RDMA_READ, 1);
}
stop_test_timer();
exchange_results();
@@ -1199,37 +1203,39 @@
* Measure RDMA bandwidth (client side).
*/
static void
-rd_client_rdma_bw(int transport, OPCODE opcode)
+rd_client_rdma_bw(int transport, ibv_op opcode)
{
DEVICE dev;
rd_open(&dev, transport, NCQE, 0);
rd_prep(&dev, 0);
sync_test();
- ib_post_rdma(&dev, opcode, NCQE);
+ rd_post_rdma_std(&dev, opcode, NCQE);
while (!Finished) {
int i;
struct ibv_wc wc[NCQE];
- int n = ib_poll(&dev, wc, cardof(wc));
+ int n = rd_poll(&dev, wc, cardof(wc));
+
if (Finished)
break;
if (n > LStat.max_cqes)
LStat.max_cqes = n;
for (i = 0; i < n; ++i) {
int status = wc[i].status;
+
if (status == IBV_WC_SUCCESS) {
if (opcode == IBV_WR_RDMA_READ) {
- LStat.r.no_bytes += Req.msg_size;
+ LStat.r.no_bytes += dev.msg_size;
LStat.r.no_msgs++;
- LStat.rem_s.no_bytes += Req.msg_size;
+ LStat.rem_s.no_bytes += dev.msg_size;
LStat.rem_s.no_msgs++;
if (Req.access_recv)
- touch_data(dev.buffer, Req.msg_size);
+ touch_data(dev.buffer, dev.msg_size);
}
} else
do_error(status, &LStat.s.no_errs);
}
- ib_post_rdma(&dev, opcode, n);
+ rd_post_rdma_std(&dev, opcode, n);
}
stop_test_timer();
exchange_results();
@@ -1241,25 +1247,12 @@
* Server just waits and lets driver take care of any requests.
*/
static void
-rd_server_nop(int transport, ETEST etest)
+rd_server_nop(int transport, int size)
{
DEVICE dev;
- uint32_t size = 0;
/* workaround: Size of RQ should be 0; bug in Mellanox driver */
rd_open(&dev, transport, 0, 1);
-
- /* Compute the size of the memory region */
- if (etest == RC_COMPARE_SWAP_MR || etest == RC_FETCH_ADD_MR)
- size = sizeof(uint64_t);
- else if (etest == VER_RC_COMPARE_SWAP || etest == VER_RC_FETCH_ADD) {
- recv_mesg(&size, sizeof(size), "Memory region size");
- size = decode_uint32(&size);
- } else if (etest == RC_RDMA_READ_BW || etest == RC_RDMA_READ_LAT)
- size = 0;
- else
- error(BUG, "rd_server_nop: bad etest: %d", etest);
-
rd_prep(&dev, size);
sync_test();
while (!Finished)
@@ -1274,7 +1267,7 @@
* Measure messaging rate for an atomic operation.
*/
static void
-ib_client_atomic(ETEST etest)
+ib_client_atomic(ATOMIC atomic)
{
int i;
DEVICE dev;
@@ -1284,33 +1277,29 @@
rd_prep(&dev, sizeof(uint64_t));
sync_test();
- for (i = 0; i < Req.rd_atomic; ++i) {
- if (etest == RC_FETCH_ADD_MR)
- ib_post_fetch_add(&dev, 0, 0, 0);
- else if (etest == RC_COMPARE_SWAP_MR)
- ib_post_compare_swap(&dev, 0, 0, 0, 0);
- else
- error(BUG, "ib_client_atomic: bad etest: %d", etest);
+ for (i = 0; i < NCQE; ++i) {
+ if (Finished)
+ break;
+ ib_post_atomic(&dev, atomic, 0, 0, 0, 0);
}
while (!Finished) {
struct ibv_wc wc[NCQE];
- int n = ib_poll(&dev, wc, cardof(wc));
+ int n = rd_poll(&dev, wc, cardof(wc));
+
if (Finished)
break;
if (n > LStat.max_cqes)
LStat.max_cqes = n;
for (i = 0; i < n; ++i) {
int status = wc[i].status;
+
if (status == IBV_WC_SUCCESS) {
LStat.rem_r.no_bytes += sizeof(uint64_t);
LStat.rem_r.no_msgs++;
} else
do_error(status, &LStat.s.no_errs);
- if (etest == RC_FETCH_ADD_MR)
- ib_post_fetch_add(&dev, 0, 0, 0);
- else
- ib_post_compare_swap(&dev, 0, 0, 0, 0);
+ ib_post_atomic(&dev, atomic, 0, 0, 0, 0);
}
}
@@ -1322,11 +1311,107 @@
/*
+ * Verify RC compare and swap (client side).
+ */
+static void
+ib_client_verify_atomic(ATOMIC atomic)
+{
+ int i;
+ int slots;
+ DEVICE dev;
+ uint64_t args[2];
+ int head = 0;
+ int tail = 0;
+
+ rd_params(IBV_QPT_RC, K64, 1, 1);
+ rd_open(&dev, IBV_QPT_RC, NCQE, 0);
+ slots = Req.msg_size / sizeof(uint64_t);
+ if (slots < 1)
+ error(0, "message size must be at least %d", sizeof(uint64_t));
+ if (slots > NCQE)
+ slots = NCQE;
+ rd_prep(&dev, 0);
+ sync_test();
+
+ for (i = 0; i < slots; ++i) {
+ if (Finished)
+ break;
+ atomic_seq(atomic, head++, 0, args);
+ ib_post_atomic(&dev, atomic, i, i*sizeof(uint64_t), args[0], args[1]);
+ }
+
+ while (!Finished) {
+ struct ibv_wc wc[NCQE];
+ int n = rd_poll(&dev, wc, cardof(wc));
+
+ if (Finished)
+ break;
+ if (n > LStat.max_cqes)
+ LStat.max_cqes = n;
+ for (i = 0; i < n; ++i) {
+ uint64_t want;
+ uint64_t seen;
+ int x = wc[i].wr_id;
+ int status = wc[i].status;
+
+ if (status == IBV_WC_SUCCESS) {
+ LStat.rem_r.no_bytes += sizeof(uint64_t);
+ LStat.rem_r.no_msgs++;
+ } else
+ do_error(status, &LStat.s.no_errs);
+
+ atomic_seq(atomic, tail++, &want, 0);
+ seen = ((uint64_t *)dev.buffer)[x];
+ if (seen != want) {
+ error(0, "mismatch, sequence %d, expected %llx, got %llx",
+ tail, (long long)want, (long long)seen);
+ }
+ atomic_seq(atomic, head++, 0, args);
+ ib_post_atomic(&dev, atomic, x, x*sizeof(uint64_t),
+ args[0], args[1]);
+ }
+ }
+ stop_test_timer();
+ exchange_results();
+ rd_close(&dev);
+ show_results(MSG_RATE);
+}
+
+
+/*
+ * Given an atomic operation and an index, return the next value associated
+ * with that index and the arguments we might pass to post that atomic.
+ */
+static void
+atomic_seq(ATOMIC atomic, int i, uint64_t *value, uint64_t *args)
+{
+ if (atomic == COMPARE_SWAP) {
+ uint64_t v;
+ uint64_t magic = 0x0123456789abcdefULL;
+
+ v = i ? magic + i-1 : 0;
+ if (value)
+ *value = v;
+ if (args) {
+ args[0] = v;
+ args[1] = magic + i;
+ }
+ } else if (atomic == FETCH_ADD) {
+ if (value)
+ *value = i;
+ if (args)
+ args[0] = 1;
+ }
+}
+
+
+/*
* Set default parameters.
*/
static void
-rd_params(int transport, long msgSize, int poll, int atomic)
+rd_params(int transport, long msg_size, int poll, int atomic)
{
+ //if (transport == IBV_QPT_RC || transport == IBV_QPT_UD) {
if (transport == IBV_QPT_RC) {
par_use(L_USE_CM);
par_use(R_USE_CM);
@@ -1344,11 +1429,13 @@
par_use(R_SL);
par_use(L_STATIC_RATE);
par_use(R_STATIC_RATE);
+ par_use(L_SRC_PATH_BITS);
+ par_use(R_SRC_PATH_BITS);
}
- if (msgSize) {
- setp_u32(0, L_MSG_SIZE, msgSize);
- setp_u32(0, R_MSG_SIZE, msgSize);
+ if (msg_size) {
+ setp_u32(0, L_MSG_SIZE, msg_size);
+ setp_u32(0, R_MSG_SIZE, msg_size);
}
if (poll) {
@@ -1368,7 +1455,7 @@
* Open a RDMA device.
*/
static void
-rd_open(DEVICE *dev, int trans, int maxSendWR, int maxRecvWR)
+rd_open(DEVICE *dev, int trans, int max_send_wr, int max_recv_wr)
{
/* Send request to client */
if (is_client())
@@ -1379,8 +1466,8 @@
/* Set transport type and maximum work request parameters */
dev->trans = trans;
- dev->maxSendWR = maxSendWR;
- dev->maxRecvWR = maxRecvWR;
+ dev->max_send_wr = max_send_wr;
+ dev->max_recv_wr = max_recv_wr;
/* Open device */
if (Req.use_cm)
@@ -1395,7 +1482,7 @@
if (ibv_query_qp(dev->qp, &qp_attr, 0, &qp_init_attr) != 0)
error(SYS, "query QP failed");
- dev->maxInline = qp_attr.cap.max_inline_data;
+ dev->max_inline = qp_attr.cap.max_inline_data;
}
}
@@ -1406,28 +1493,16 @@
static void
rd_prep(DEVICE *dev, int size)
{
+ /* Set the size of the messages we transfer */
+ if (size == 0)
+ dev->msg_size = Req.msg_size;
+
/* Allocate memory region */
if (size == 0)
- size = Req.msg_size;
+ size = dev->msg_size;
if (dev->trans == IBV_QPT_UD)
size += GRH_SIZE;
- if (size == 0)
- size = 1;
- if (size) {
- int pagesize = sysconf(_SC_PAGESIZE);
- if (posix_memalign((void **)&dev->buffer, pagesize, size) != 0)
- error(SYS, "failed to allocate memory");
- memset(dev->buffer, 0, size);
- int flags = IBV_ACCESS_LOCAL_WRITE |
- IBV_ACCESS_REMOTE_READ |
- IBV_ACCESS_REMOTE_WRITE |
- IBV_ACCESS_REMOTE_ATOMIC;
- dev->mr = ibv_reg_mr(dev->pd, dev->buffer, size, flags);
- if (!dev->mr)
- error(SYS, "failed to allocate memory region");
- dev->lnode.rkey = dev->mr->rkey;
- dev->lnode.vaddr = (unsigned long)dev->buffer;
- }
+ rd_mralloc(dev, size);
/* Exchange node information */
{
@@ -1469,44 +1544,57 @@
if (!Debug)
return;
n = &dev->lnode;
+
if (Req.use_cm)
debug("L: rkey=%08x vaddr=%010x", n->rkey, n->vaddr);
- else
+ else if (dev->trans == IBV_QPT_XRC) {
+ debug("L: lid=%04x qpn=%06x psn=%06x rkey=%08x vaddr=%010x srqn=%08x",
+ n->lid, n->qpn, n->psn, n->rkey, n->vaddr, n->srqn);
+ } else {
debug("L: lid=%04x qpn=%06x psn=%06x rkey=%08x vaddr=%010x",
- n->lid, n->qpn, n->psn, n->rkey, n->vaddr);
+ n->lid, n->qpn, n->psn, n->rkey, n->vaddr);
+ }
+
n = &dev->rnode;
if (Req.use_cm)
debug("R: rkey=%08x vaddr=%010x", n->rkey, n->vaddr);
- else
+ else if (dev->trans == IBV_QPT_XRC) {
+ debug("R: lid=%04x qpn=%06x psn=%06x rkey=%08x vaddr=%010x srqn=%08x",
+ n->lid, n->qpn, n->psn, n->rkey, n->vaddr, n->srqn);
+ } else {
debug("R: lid=%04x qpn=%06x psn=%06x rkey=%08x vaddr=%010x",
- n->lid, n->qpn, n->psn, n->rkey, n->vaddr);
+ n->lid, n->qpn, n->psn, n->rkey, n->vaddr);
+ }
}
/*
* Close a RDMA device. We must destroy the CQ before the QP otherwise the
- * ibv_destroy_qp call seems to hang sometimes.
+ * ibv_destroy_qp call seems to sometimes hang. We must also destroy the QP
+ * before destroying the memory region as we cannot destroy the memory region
+ * if there are references still outstanding. Hopefully we now have things in
+ * the right order.
*/
static void
rd_close(DEVICE *dev)
{
+ if (Req.use_cm)
+ cm_close(dev);
+ else
+ ib_close1(dev);
+
if (dev->ah)
ibv_destroy_ah(dev->ah);
if (dev->cq)
ibv_destroy_cq(dev->cq);
- if (dev->mr)
- ibv_dereg_mr(dev->mr);
if (dev->pd)
ibv_dealloc_pd(dev->pd);
if (dev->channel)
ibv_destroy_comp_channel(dev->channel);
- if (dev->buffer)
- free(dev->buffer);
+ rd_mrfree(dev);
- if (Req.use_cm)
- cm_close(dev);
- else
- ib_close(dev);
+ if (!Req.use_cm)
+ ib_close2(dev);
memset(dev, 0, sizeof(*dev));
}
@@ -1522,12 +1610,14 @@
{
struct ibv_device_attr dev_attr;
- if (ibv_query_device(context, &dev_attr) != 0)
+ if (ibv_query_device(context, &dev_attr) != SUCCESS0)
error(SYS, "query device failed");
if (Req.rd_atomic == 0)
- Req.rd_atomic = dev_attr.max_qp_rd_atom;
- else if (Req.rd_atomic > dev_attr.max_qp_rd_atom)
- error(0, "device only supports %d (< %d) RDMA reads or atomic ops",
+ dev->lnode.rd_atomic = dev_attr.max_qp_rd_atom;
+ else if (Req.rd_atomic <= dev_attr.max_qp_rd_atom)
+ dev->lnode.rd_atomic = Req.rd_atomic;
+ else
+ error(0, "device only supports %d (< %d) RDMA reads or atomics",
dev_attr.max_qp_rd_atom, Req.rd_atomic);
}
@@ -1543,31 +1633,52 @@
/* Create completion queue */
dev->cq = ibv_create_cq(context,
- dev->maxSendWR+dev->maxRecvWR, 0, dev->channel, 0);
+ dev->max_send_wr+dev->max_recv_wr, 0, dev->channel, 0);
if (!dev->cq)
error(SYS, "failed to create completion queue");
/* Create queue pair */
{
- struct ibv_qp_init_attr attr ={
+ struct ibv_qp_init_attr qp_attr ={
.send_cq = dev->cq,
.recv_cq = dev->cq,
.cap ={
- .max_send_wr = dev->maxSendWR,
- .max_recv_wr = dev->maxRecvWR,
+ .max_send_wr = dev->max_send_wr,
+ .max_recv_wr = dev->max_recv_wr,
.max_send_sge = 1,
.max_recv_sge = 1,
- .max_inline_data = 0
},
.qp_type = dev->trans
};
if (Req.use_cm) {
- if (rdma_create_qp(id, dev->pd, &attr) != 0)
- error(0, "failed to create QP");
+ if (rdma_create_qp(id, dev->pd, &qp_attr) != 0)
+ error(SYS, "failed to create QP");
dev->qp = id->qp;
} else {
- dev->qp = ibv_create_qp(dev->pd, &attr);
+ if (dev->trans == IBV_QPT_XRC) {
+ struct ibv_srq_init_attr srq_attr ={
+ .attr ={
+ .max_wr = dev->max_recv_wr,
+ .max_sge = 1
+ }
+ };
+
+ dev->xrc = ibv_open_xrc_domain(context, -1, O_CREAT);
+ if (!dev->xrc)
+ error(SYS, "failed to open XRC domain");
+
+ dev->srq = ibv_create_xrc_srq(dev->pd, dev->xrc, dev->cq,
+ &srq_attr);
+ if (!dev->srq)
+ error(SYS, "failed to create SRQ");
+
+ qp_attr.cap.max_recv_wr = 0;
+ qp_attr.cap.max_recv_sge = 0;
+ qp_attr.xrc_domain = dev->xrc;
+ }
+
+ dev->qp = ibv_create_qp(dev->pd, &qp_attr);
if (!dev->qp)
error(SYS, "failed to create QP");
}
@@ -1576,6 +1687,60 @@
/*
+ * Allocate a memory region and register it. I thought this routine should
+ * never be called with a size of 0 as prior code checks for that and sets it
+ * to some default value. I appear to be wrong. In that case, size is set to
+ * 1 so other code does not break.
+ */
+static void
+rd_mralloc(DEVICE *dev, int size)
+{
+ int flags;
+ int pagesize;
+
+ if (dev->buffer)
+ error(BUG, "rd_mralloc: memory region already allocated");
+ if (size == 0)
+ size = 1;
+
+ pagesize = sysconf(_SC_PAGESIZE);
+ if (posix_memalign((void **)&dev->buffer, pagesize, size) != 0)
+ error(SYS, "failed to allocate memory");
+ memset(dev->buffer, 0, size);
+ dev->buf_size = size;
+ flags = IBV_ACCESS_LOCAL_WRITE |
+ IBV_ACCESS_REMOTE_READ |
+ IBV_ACCESS_REMOTE_WRITE |
+ IBV_ACCESS_REMOTE_ATOMIC;
+ dev->mr = ibv_reg_mr(dev->pd, dev->buffer, size, flags);
+ if (!dev->mr)
+ error(SYS, "failed to allocate memory region");
+ dev->lnode.rkey = dev->mr->rkey;
+ dev->lnode.vaddr = (unsigned long)dev->buffer;
+}
+
+
+/*
+ * Free the memory region.
+ */
+static void
+rd_mrfree(DEVICE *dev)
+{
+ if (dev->mr)
+ ibv_dereg_mr(dev->mr);
+ dev->mr = NULL;
+
+ if (dev->buffer)
+ free(dev->buffer);
+ dev->buffer = NULL;
+ dev->buf_size = 0;
+
+ dev->lnode.rkey = 0;
+ dev->lnode.vaddr = 0;
+}
+
+
+/*
* Open a device using the Connection Manager.
*/
static void
@@ -1597,11 +1762,12 @@
cm_init(DEVICE *dev)
{
CMINFO *cm = &dev->cm;
+ int portspace = (dev->trans == IBV_QPT_RC) ? RDMA_PS_TCP : RDMA_PS_UDP;
cm->channel = rdma_create_event_channel();
if (!cm->channel)
error(0, "rdma_create_event_channel failed");
- if (rdma_create_id(cm->channel, &cm->id, 0, RDMA_PS_TCP) != 0)
+ if (rdma_create_id(cm->channel, &cm->id, 0, portspace) != 0)
error(0, "rdma_create_id failed");
}
@@ -1618,12 +1784,6 @@
.ai_family = AF_INET,
.ai_socktype = SOCK_STREAM
};
- struct rdma_conn_param param ={
- .responder_resources = 1,
- .initiator_depth = 1,
- .rnr_retry_count = RNR_RETRY_CNT,
- .retry_count = RETRY_CNT
- };
int timeout = Req.timeout * 1000;
CMINFO *cm = &dev->cm;
@@ -1642,12 +1802,35 @@
error(0, "rdma_resolve_route failed");
cm_expect_event(dev, RDMA_CM_EVENT_ROUTE_RESOLVED);
cm_ack_event(dev);
+ rd_create_qp(dev, cm->id->verbs, cm->id);
- rd_create_qp(dev, cm->id->verbs, cm->id);
- if (rdma_connect(cm->id, &param) != 0)
- error(0, "rdma_connect failed");
- cm_expect_event(dev, RDMA_CM_EVENT_ESTABLISHED);
- cm_ack_event(dev);
+ if (dev->trans == IBV_QPT_RC) {
+ struct rdma_conn_param param ={
+ .responder_resources = 1,
+ .initiator_depth = 1,
+ .rnr_retry_count = RNR_RETRY_CNT,
+ .retry_count = RETRY_CNT
+ };
+
+ if (rdma_connect(cm->id, &param) != 0)
+ error(0, "rdma_connect failed");
+ cm_expect_event(dev, RDMA_CM_EVENT_ESTABLISHED);
+ cm_ack_event(dev);
+ } else if (dev->trans == IBV_QPT_UD) {
+ struct rdma_conn_param param ={
+ .qp_num = cm->id->qp->qp_num
+ };
+
+ if (rdma_connect(cm->id, &param) != 0)
+ error(0, "rdma_connect failed");
+ cm_expect_event(dev, RDMA_CM_EVENT_ESTABLISHED);
+ dev->qkey = cm->event->param.ud.qkey;
+ dev->ah = ibv_create_ah(dev->pd, &cm->event->param.ud.ah_attr);
+ if (!dev->ah)
+ error(SYS, "failed to create address handle");
+ cm_ack_event(dev);
+ } else
+ error(BUG, "cm_open_client: bad transport: %d", dev->trans);
}
@@ -1663,12 +1846,6 @@
.sin_addr.s_addr = htonl(INADDR_ANY),
.sin_port = htons(0)
};
- struct rdma_conn_param param ={
- .responder_resources = 1,
- .initiator_depth = 1,
- .rnr_retry_count = RNR_RETRY_CNT,
- .retry_count = RETRY_CNT
- };
CMINFO *cm = &dev->cm;
if (rdma_bind_addr(cm->id, (SA *)&saddr) != 0)
@@ -1682,11 +1859,39 @@
cm_expect_event(dev, RDMA_CM_EVENT_CONNECT_REQUEST);
rd_create_qp(dev, cm->event->id->verbs, cm->event->id);
- if (rdma_accept(cm->event->id, &param) != 0)
- error(0, "rdma_accept failed");
- cm_ack_event(dev);
- cm_expect_event(dev, RDMA_CM_EVENT_ESTABLISHED);
- cm_ack_event(dev);
+ if (dev->trans == IBV_QPT_RC) {
+ struct rdma_conn_param param ={
+ .responder_resources = 1,
+ .initiator_depth = 1,
+ .rnr_retry_count = RNR_RETRY_CNT,
+ .retry_count = RETRY_CNT
+ };
+ struct ibv_qp_attr rtr_attr ={
+ .min_rnr_timer = MIN_RNR_TIMER,
+ };
+
+ if (rdma_accept(cm->event->id, &param) != 0)
+ error(0, "rdma_accept failed");
+ cm_ack_event(dev);
+ cm_expect_event(dev, RDMA_CM_EVENT_ESTABLISHED);
+ cm_ack_event(dev);
+
+ /* Do not complain on error as we might be on a iWARP device */
+ ibv_modify_qp(dev->qp, &rtr_attr, IBV_QP_MIN_RNR_TIMER);
+ } else if (dev->trans == IBV_QPT_UD) {
+ struct rdma_conn_param param ={
+ .qp_num = cm->event->id->qp->qp_num
+ };
+
+ if (rdma_accept(cm->event->id, &param) != 0)
+ error(0, "rdma_accept failed");
+ dev->qkey = cm->event->param.ud.qkey;
+ dev->ah = ibv_create_ah(dev->pd, &cm->event->param.ud.ah_attr);
+ if (!dev->ah)
+ error(SYS, "failed to create address handle");
+ cm_ack_event(dev);
+ } else
+ error(BUG, "cm_open_server: bad transport: %d", dev->trans);
}
@@ -1696,13 +1901,6 @@
static void
cm_prep(DEVICE *dev)
{
- struct ibv_qp_attr rtr_attr ={
- .min_rnr_timer = MIN_RNR_TIMER,
- };
-
- /* Do not complain if error as we might be on a iWARP device */
- if (dev->trans == IBV_QPT_RC)
- ibv_modify_qp(dev->qp, &rtr_attr, IBV_QP_MIN_RNR_TIMER);
}
@@ -1723,7 +1921,7 @@
/*
- * Get an event from the Communication Manager. If it is not what we expect,
+ * Get an event from the Connection Manager. If it is not what we expect,
* complain.
*/
static void
@@ -1744,7 +1942,8 @@
/*
- * Return a name given a RDMA CM event number.
+ * Return a name given a RDMA CM event number. We first look at our list. If
+ * that fails, we call the standard rdma_event_str routine.
*/
static char *
cm_event_name(int event, char *data, int size)
@@ -1754,7 +1953,8 @@
for (i = 0; i < cardof(CMEvents); ++i)
if (event == CMEvents[i].value)
return CMEvents[i].name;
- snprintf(data, size, "%d", event);
+ strncpy(data, rdma_event_str(event), size);
+ data[size-1] = '\0';
return data;
}
@@ -1779,6 +1979,7 @@
/* Determine MTU */
{
int mtu = Req.mtu_size;
+
if (mtu == 256)
dev->ib.mtu = IBV_MTU_256;
else if (mtu == 512)
@@ -1797,6 +1998,7 @@
{
int port = 1;
char *p = index(Req.id, ':');
+
if (p) {
*p++ = '\0';
port = atoi(p);
@@ -1821,6 +2023,9 @@
}
}
+ /* Set up Q Key */
+ dev->qkey = QKEY;
+
/* Open device */
{
struct ibv_device *device;
@@ -1849,12 +2054,14 @@
/* Set up local node LID */
{
struct ibv_port_attr port_attr;
+ int stat = ibv_query_port(dev->ib.context, dev->ib.port, &port_attr);
- int stat = ibv_query_port(dev->ib.context, dev->ib.port, &port_attr);
if (stat != 0)
error(SYS, "query port failed");
srand48(getpid()*time(0));
dev->lnode.lid = port_attr.lid;
+ if (port_attr.lmc > 0)
+ dev->lnode.lid += Req.src_path_bits & ((1 << port_attr.lmc) - 1);
}
/* Create QP */
@@ -1871,8 +2078,8 @@
if (dev->trans == IBV_QPT_UD) {
flags |= IBV_QP_QKEY;
- attr.qkey = QKEY;
- } else if (dev->trans == IBV_QPT_RC) {
+ attr.qkey = dev->qkey;
+ } else if (dev->trans == IBV_QPT_RC || dev->trans == IBV_QPT_XRC) {
flags |= IBV_QP_ACCESS_FLAGS;
attr.qp_access_flags =
IBV_ACCESS_REMOTE_READ |
@@ -1882,13 +2089,28 @@
flags |= IBV_QP_ACCESS_FLAGS;
attr.qp_access_flags = IBV_ACCESS_REMOTE_WRITE;
}
- if (ibv_modify_qp(dev->qp, &attr, flags) != 0)
+ if (ibv_modify_qp(dev->qp, &attr, flags) != SUCCESS0)
error(SYS, "failed to modify QP to INIT state");
}
- /* Set up local node QP number and PSN */
+ /* Set up local node QP number, PSN and SRQ number */
dev->lnode.qpn = dev->qp->qp_num;
dev->lnode.psn = lrand48() & 0xffffff;
+ if (dev->trans == IBV_QPT_XRC)
+ dev->lnode.srqn = dev->srq->xrc_srq_num;
+
+ /* Set up alternate port LID */
+ if (Req.alt_port) {
+ struct ibv_port_attr port_attr;
+ int stat = ibv_query_port(dev->ib.context, Req.alt_port, &port_attr);
+
+ if (stat != SUCCESS0)
+ error(SYS, "query port failed");
+ dev->lnode.alt_lid = port_attr.lid;
+ if (port_attr.lmc > 0)
+ dev->lnode.alt_lid +=
+ Req.src_path_bits & ((1 << port_attr.lmc) - 1);
+ }
}
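Note on the LID arithmetic in the hunk above: when the port reports a non-zero LMC, the low LMC bits of Req.src_path_bits are added to the port's base LID, for both the primary and the alternate port. A minimal standalone sketch of that calculation follows. The helper name lid_with_path_bits is hypothetical and is not part of qperf; the sketch only mirrors the inline expression in the diff.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not in qperf): fold the low `lmc` bits of
 * src_path_bits into a port's base LID, mirroring the inline
 * expression  lid += src_path_bits & ((1 << lmc) - 1).
 */
static uint16_t
lid_with_path_bits(uint16_t base_lid, uint8_t lmc, uint8_t src_path_bits)
{
    if (lmc == 0)
        return base_lid;            /* single LID per port: nothing to add */
    return (uint16_t)(base_lid + (src_path_bits & ((1u << lmc) - 1)));
}
```

With an LMC of 2 the port answers to four consecutive LIDs, so only the two low bits of the requested path bits are kept; anything above that is masked off rather than rejected.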
@@ -1906,27 +2128,38 @@
.dest_qp_num = dev->rnode.qpn,
.rq_psn = dev->rnode.psn,
.min_rnr_timer = MIN_RNR_TIMER,
- .max_dest_rd_atomic = Req.rd_atomic,
+ .max_dest_rd_atomic = dev->lnode.rd_atomic,
.ah_attr = {
.dlid = dev->rnode.lid,
.port_num = dev->ib.port,
.static_rate = dev->ib.rate,
+ .src_path_bits = Req.src_path_bits,
.sl = Req.sl
}
};
struct ibv_qp_attr rts_attr ={
- .qp_state = IBV_QPS_RTS,
- .timeout = LOCAL_ACK_TIMEOUT,
- .retry_cnt = RETRY_CNT,
- .rnr_retry = RNR_RETRY_CNT,
- .sq_psn = dev->lnode.psn,
- .max_rd_atomic = Req.rd_atomic
+ .qp_state = IBV_QPS_RTS,
+ .timeout = LOCAL_ACK_TIMEOUT,
+ .retry_cnt = RETRY_CNT,
+ .rnr_retry = RNR_RETRY_CNT,
+ .sq_psn = dev->lnode.psn,
+ .max_rd_atomic = dev->rnode.rd_atomic,
+ .path_mig_state = IBV_MIG_REARM,
+ .alt_port_num = Req.alt_port,
+ .alt_ah_attr = {
+ .dlid = dev->rnode.alt_lid,
+ .port_num = Req.alt_port,
+ .static_rate = dev->ib.rate,
+ .src_path_bits = Req.src_path_bits,
+ .sl = Req.sl
+ }
};
struct ibv_ah_attr ah_attr ={
- .dlid = dev->rnode.lid,
- .port_num = dev->ib.port,
- .static_rate = dev->ib.rate,
- .sl = Req.sl
+ .dlid = dev->rnode.lid,
+ .port_num = dev->ib.port,
+ .static_rate = dev->ib.rate,
+ .src_path_bits = Req.src_path_bits,
+ .sl = Req.sl
};
if (dev->trans == IBV_QPT_UD) {
@@ -1944,7 +2177,7 @@
dev->ah = ibv_create_ah(dev->pd, &ah_attr);
if (!dev->ah)
error(SYS, "failed to create address handle");
- } else if (dev->trans == IBV_QPT_RC) {
+ } else if (dev->trans == IBV_QPT_RC || dev->trans == IBV_QPT_XRC) {
/* Modify queue pair to RTR */
flags = IBV_QP_STATE |
IBV_QP_AV |
@@ -1963,6 +2196,8 @@
IBV_QP_RNR_RETRY |
IBV_QP_SQ_PSN |
IBV_QP_MAX_QP_RD_ATOMIC;
+ if (dev->trans == IBV_QPT_RC && dev->rnode.alt_lid)
+ flags |= IBV_QP_ALT_PATH | IBV_QP_PATH_MIG_STATE;
if (ibv_modify_qp(dev->qp, &rts_attr, flags) != 0)
error(SYS, "failed to modify QP to RTS");
} else if (dev->trans == IBV_QPT_UC) {
@@ -1978,6 +2213,8 @@
/* Modify queue pair to RTS */
flags = IBV_QP_STATE |
IBV_QP_SQ_PSN;
+ if (dev->rnode.alt_lid)
+ flags |= IBV_QP_ALT_PATH | IBV_QP_PATH_MIG_STATE;
if (ibv_modify_qp(dev->qp, &rts_attr, flags) != 0)
error(SYS, "failed to modify QP to RTS");
}
@@ -1985,13 +2222,26 @@
/*
- * Close an InfiniBand device.
+ * Close an InfiniBand device, part 1.
*/
static void
-ib_close(DEVICE *dev)
+ib_close1(DEVICE *dev)
{
if (dev->qp)
ibv_destroy_qp(dev->qp);
+ if (dev->srq)
+ ibv_destroy_srq(dev->srq);
+ if (dev->xrc)
+ ibv_close_xrc_domain(dev->xrc);
+}
+
+
+/*
+ * Close an InfiniBand device, part 2.
+ */
+static void
+ib_close2(DEVICE *dev)
+{
if (dev->ib.context)
ibv_close_device(dev->ib.context);
if (dev->ib.devlist)
@@ -2000,54 +2250,38 @@
/*
- * Post a compare and swap request.
+ * Cause a path migration to happen.
*/
static void
-ib_post_compare_swap(DEVICE *dev,
- int wrid, int offset, uint64_t compare, uint64_t swap)
+ib_migrate(DEVICE *dev)
{
- struct ibv_sge sge ={
- .addr = (uintptr_t)dev->buffer + offset,
- .length = sizeof(uint64_t),
- .lkey = dev->mr->lkey
- };
- struct ibv_send_wr wr ={
- .wr_id = wrid,
- .sg_list = &sge,
- .num_sge = 1,
- .opcode = IBV_WR_ATOMIC_CMP_AND_SWP,
- .send_flags = IBV_SEND_SIGNALED,
- .wr = {
- .atomic = {
- .remote_addr = dev->rnode.vaddr,
- .rkey = dev->rnode.rkey,
- .compare_add = compare,
- .swap = swap
- }
- }
- };
- struct ibv_send_wr *badWR;
+ if (!Req.alt_port)
+ return;
+ /* Only migrate once. */
+ Req.alt_port = 0;
+ if (dev->trans != IBV_QPT_RC && dev->trans != IBV_QPT_UC)
+ return;
- errno = 0;
- if (ibv_post_send(dev->qp, &wr, &badWR) != SUCCESS0) {
- if (Finished && errno == EINTR)
- return;
- error(SYS, "failed to post compare and swap");
+ {
+ struct ibv_qp_attr attr ={
+ .path_mig_state = IBV_MIG_MIGRATED,
+ };
+
+ if (ibv_modify_qp(dev->qp, &attr, IBV_QP_PATH_MIG_STATE) != SUCCESS0)
+ error(SYS, "failed to modify QP to Migrated state");
}
-
- LStat.s.no_bytes += sizeof(uint64_t);
- LStat.s.no_msgs++;
}
/*
- * Post a fetch and add request.
+ * Post an atomic.
*/
static void
-ib_post_fetch_add(DEVICE *dev, int wrid, int offset, uint64_t add)
+ib_post_atomic(DEVICE *dev, ATOMIC atomic, int wrid,
+ int offset, uint64_t compare_add, uint64_t swap)
{
struct ibv_sge sge ={
- .addr = (uintptr_t) dev->buffer + offset,
+ .addr = (uintptr_t)dev->buffer + offset,
.length = sizeof(uint64_t),
.lkey = dev->mr->lkey
};
@@ -2055,23 +2289,35 @@
.wr_id = wrid,
.sg_list = &sge,
.num_sge = 1,
- .opcode = IBV_WR_ATOMIC_FETCH_AND_ADD,
.send_flags = IBV_SEND_SIGNALED,
.wr = {
.atomic = {
- .remote_addr = dev->rnode.vaddr,
- .rkey = dev->rnode.rkey,
- .compare_add = add
+ .remote_addr = dev->rnode.vaddr,
+ .rkey = dev->rnode.rkey,
}
}
};
- struct ibv_send_wr *badWR;
+ struct ibv_send_wr *badwr;
+ if (atomic == COMPARE_SWAP) {
+ wr.opcode = IBV_WR_ATOMIC_CMP_AND_SWP;
+ wr.wr.atomic.compare_add = compare_add;
+ wr.wr.atomic.swap = swap;
+ } else if (atomic == FETCH_ADD) {
+ wr.opcode = IBV_WR_ATOMIC_FETCH_AND_ADD;
+ wr.wr.atomic.compare_add = compare_add;
+ }
+
errno = 0;
- if (ibv_post_send(dev->qp, &wr, &badWR) != SUCCESS0) {
+ if (ibv_post_send(dev->qp, &wr, &badwr) != SUCCESS0) {
if (Finished && errno == EINTR)
return;
- error(SYS, "failed to post fetch and add");
+ if (atomic == COMPARE_SWAP)
+ error(SYS, "failed to post compare and swap");
+ else if (atomic == FETCH_ADD)
+ error(SYS, "failed to post fetch and add");
+ else
+ error(BUG, "bad atomic: %d", atomic);
}
LStat.s.no_bytes += sizeof(uint64_t);
@@ -2080,14 +2326,25 @@
/*
+ * The standard version to post sends that most of the test routines call.
* Post n sends.
*/
static void
-ib_post_send(DEVICE *dev, int n)
+rd_post_send_std(DEVICE *dev, int n)
{
+ rd_post_send(dev, 0, dev->msg_size, 0, n, 1);
+}
+
+
+/*
+ * Post one or more sends.
+ */
+static void
+rd_post_send(DEVICE *dev, int off, int len, int inc, int rep, int stat)
+{
struct ibv_sge sge ={
- .addr = (uintptr_t) dev->buffer,
- .length = Req.msg_size,
+ .addr = (uintptr_t) &dev->buffer[off],
+ .length = len,
.lkey = dev->mr->lkey
};
struct ibv_send_wr wr ={
@@ -2097,24 +2354,31 @@
.opcode = IBV_WR_SEND,
.send_flags = IBV_SEND_SIGNALED,
};
- struct ibv_send_wr *badWR;
+ struct ibv_send_wr *badwr;
if (dev->trans == IBV_QPT_UD) {
wr.wr.ud.ah = dev->ah;
wr.wr.ud.remote_qpn = dev->rnode.qpn;
- wr.wr.ud.remote_qkey = QKEY;
- }
- if (Req.msg_size <= dev->maxInline)
+ wr.wr.ud.remote_qkey = dev->qkey;
+ } else if (dev->trans == IBV_QPT_XRC)
+ wr.xrc_remote_srq_num = dev->rnode.srqn;
+
+ if (dev->msg_size <= dev->max_inline)
wr.send_flags |= IBV_SEND_INLINE;
+
errno = 0;
- while (n-- > 0) {
- if (ibv_post_send(dev->qp, &wr, &badWR) != SUCCESS0) {
+ while (!Finished && rep-- > 0) {
+ if (ibv_post_send(dev->qp, &wr, &badwr) != SUCCESS0) {
if (Finished && errno == EINTR)
return;
error(SYS, "failed to post send");
}
- LStat.s.no_bytes += Req.msg_size;
- LStat.s.no_msgs++;
+ sge.addr += inc;
+ sge.length += inc;
+ if (stat) {
+ LStat.s.no_bytes += dev->msg_size;
+ LStat.s.no_msgs++;
+ }
}
}
@@ -2123,11 +2387,11 @@
* Post n receives.
*/
static void
-ib_post_recv(DEVICE *dev, int n)
+rd_post_recv_std(DEVICE *dev, int n)
{
struct ibv_sge sge ={
.addr = (uintptr_t) dev->buffer,
- .length = Req.msg_size,
+ .length = dev->buf_size,
.lkey = dev->mr->lkey
};
struct ibv_recv_wr wr ={
@@ -2135,14 +2399,18 @@
.sg_list = &sge,
.num_sge = 1,
};
- struct ibv_recv_wr *badWR;
+ struct ibv_recv_wr *badwr;
- if (dev->trans == IBV_QPT_UD)
- sge.length += GRH_SIZE;
+ errno = 0;
+ while (!Finished && n-- > 0) {
+ int stat;
- errno = 0;
- while (n-- > 0) {
- if (ibv_post_recv(dev->qp, &wr, &badWR) != SUCCESS0) {
+ if (dev->srq)
+ stat = ibv_post_srq_recv(dev->srq, &wr, &badwr);
+ else
+ stat = ibv_post_recv(dev->qp, &wr, &badwr);
+
+ if (stat != SUCCESS0) {
if (Finished && errno == EINTR)
return;
error(SYS, "failed to post receive");
@@ -2155,11 +2423,11 @@
* Post n RDMA requests.
*/
static void
-ib_post_rdma(DEVICE *dev, OPCODE opcode, int n)
+rd_post_rdma_std(DEVICE *dev, ibv_op opcode, int n)
{
struct ibv_sge sge ={
.addr = (uintptr_t) dev->buffer,
- .length = Req.msg_size,
+ .length = dev->msg_size,
.lkey = dev->mr->lkey
};
struct ibv_send_wr wr ={
@@ -2175,19 +2443,19 @@
}
}
};
- struct ibv_send_wr *badWR;
+ struct ibv_send_wr *badwr;
- if (opcode != IBV_WR_RDMA_READ && Req.msg_size <= dev->maxInline)
+ if (opcode != IBV_WR_RDMA_READ && dev->msg_size <= dev->max_inline)
wr.send_flags |= IBV_SEND_INLINE;
errno = 0;
- while (n--) {
- if (ibv_post_send(dev->qp, &wr, &badWR) != SUCCESS0) {
+ while (!Finished && n--) {
+ if (ibv_post_send(dev->qp, &wr, &badwr) != SUCCESS0) {
if (Finished && errno == EINTR)
return;
error(SYS, "failed to post %s", opcode_name(wr.opcode));
}
if (opcode != IBV_WR_RDMA_READ) {
- LStat.s.no_bytes += Req.msg_size;
+ LStat.s.no_bytes += dev->msg_size;
LStat.s.no_msgs++;
}
}
@@ -2198,7 +2466,7 @@
* Poll the completion queue.
*/
static int
-ib_poll(DEVICE *dev, struct ibv_wc *wc, int nwc)
+rd_poll(DEVICE *dev, struct ibv_wc *wc, int nwc)
{
int n;
@@ -2212,6 +2480,7 @@
error(0, "CQ event for unknown CQ");
if (ibv_req_notify_cq(dev->cq, 0) != SUCCESS0)
return maybe(0, "failed to request CQ notification");
+ ibv_ack_cq_events(dev->cq, 1);
}
n = ibv_poll_cq(dev->cq, nwc, wc);
if (n < 0)
@@ -2242,11 +2511,14 @@
static void
enc_node(NODE *host)
{
- enc_int(host->lid, sizeof(host->lid));
- enc_int(host->qpn, sizeof(host->qpn));
- enc_int(host->psn, sizeof(host->psn));
- enc_int(host->rkey, sizeof(host->rkey));
- enc_int(host->vaddr, sizeof(host->vaddr));
+ enc_int(host->vaddr, sizeof(host->vaddr));
+ enc_int(host->lid, sizeof(host->lid));
+ enc_int(host->qpn, sizeof(host->qpn));
+ enc_int(host->psn, sizeof(host->psn));
+ enc_int(host->srqn, sizeof(host->srqn));
+ enc_int(host->rkey, sizeof(host->rkey));
+ enc_int(host->alt_lid, sizeof(host->alt_lid));
+ enc_int(host->rd_atomic, sizeof(host->rd_atomic));
}
@@ -2256,11 +2528,14 @@
static void
dec_node(NODE *host)
{
- host->lid = dec_int(sizeof(host->lid));
- host->qpn = dec_int(sizeof(host->qpn));
- host->psn = dec_int(sizeof(host->psn));
- host->rkey = dec_int(sizeof(host->rkey));
- host->vaddr = dec_int(sizeof(host->vaddr));
+ host->vaddr = dec_int(sizeof(host->vaddr));
+ host->lid = dec_int(sizeof(host->lid));
+ host->qpn = dec_int(sizeof(host->qpn));
+ host->psn = dec_int(sizeof(host->psn));
+ host->srqn = dec_int(sizeof(host->srqn));
+ host->rkey = dec_int(sizeof(host->rkey));
+ host->alt_lid = dec_int(sizeof(host->alt_lid));
+ host->rd_atomic = dec_int(sizeof(host->rd_atomic));
}
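Note on atomic_seq, added to rdma.c in this revision: for compare and swap, step i compares against the value written by step i-1 (0 for the first step) and swaps in magic + i, so each completion should return the previous step's value; for fetch and add, each step adds 1, so step i should return i. The sketch below replays those sequences against an ordinary local variable. The sim_* functions are hypothetical software stand-ins for the remote atomics and are not qperf code; the sketch also assumes the operations complete in the order they were posted.

```c
#include <stdint.h>

#define MAGIC 0x0123456789abcdefULL

/* Stand-in for a remote compare-and-swap: returns the old value. */
static uint64_t
sim_compare_swap(uint64_t *remote, uint64_t compare, uint64_t swap)
{
    uint64_t old = *remote;

    if (old == compare)
        *remote = swap;
    return old;
}

/* Stand-in for a remote fetch-and-add: returns the old value. */
static uint64_t
sim_fetch_add(uint64_t *remote, uint64_t add)
{
    uint64_t old = *remote;

    *remote = old + add;
    return old;
}

/* Value step i of the compare-and-swap sequence expects to find. */
static uint64_t
cas_expected(int i)
{
    return i ? MAGIC + i - 1 : 0;
}

/* Replay n compare-and-swap steps; return the number of mismatches. */
static int
verify_cas_sequence(int n)
{
    uint64_t remote = 0;
    int bad = 0;

    for (int i = 0; i < n; ++i) {
        uint64_t seen = sim_compare_swap(&remote, cas_expected(i), MAGIC + i);

        if (seen != cas_expected(i))
            ++bad;
    }
    return bad;
}

/* Replay n fetch-and-add steps of +1; return the number of mismatches. */
static int
verify_fetch_add_sequence(int n)
{
    uint64_t remote = 0;
    int bad = 0;

    for (int i = 0; i < n; ++i) {
        if (sim_fetch_add(&remote, 1) != (uint64_t)i)
            ++bad;
    }
    return bad;
}
```

Because every compare value equals the previous swap value, a single lost or reordered operation makes all later compares fail, which is what lets the verify test detect it.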
Modified: branches/ofed-1.4.1upgrade/qperf/trunk/src/rds.c
===================================================================
--- branches/ofed-1.4.1upgrade/qperf/trunk/src/rds.c 2009-05-29 14:02:35 UTC (rev 293)
+++ branches/ofed-1.4.1upgrade/qperf/trunk/src/rds.c 2009-05-29 14:04:56 UTC (rev 294)
@@ -1,8 +1,8 @@
/*
* qperf - handle RDS tests.
*
- * Copyright (c) 2002-2008 Johann George. All rights reserved.
- * Copyright (c) 2006-2008 QLogic Corporation. All rights reserved.
+ * Copyright (c) 2002-2009 Johann George. All rights reserved.
+ * Copyright (c) 2006-2009 QLogic Corporation. All rights reserved.
*
* This software is available to you under a choice of one of two
* licenses. You may choose to be licensed under the terms of the GNU
@@ -94,6 +94,7 @@
sync_test();
while (!Finished) {
int n = sendto(sockfd, buf, Req.msg_size, 0, (SA *)&RAddr, RLen);
+
if (Finished)
break;
if (n != Req.msg_size) {
@@ -159,6 +160,7 @@
sync_test();
while (!Finished) {
int n = sendto(sockfd, buf, Req.msg_size, 0, (SA *)&RAddr, RLen);
+
if (Finished)
break;
if (n != Req.msg_size) {
@@ -202,6 +204,7 @@
SS raddr;
socklen_t rlen = sizeof(raddr);
int n = recvfrom(sockfd, buf, Req.msg_size, 0, (SA *)&raddr, &rlen);
+
if (Finished)
break;
if (n != Req.msg_size) {
@@ -293,7 +296,7 @@
laddr.sin_addr.s_addr = INADDR_ANY;
laddr.sin_port = htons(0);
if (bind(lfd, (SA *)&laddr, sizeof(laddr)) < 0)
- error(SYS, "bind failed");
+ error(SYS, "bind INET failed");
port = get_socket_port(lfd);
encode_uint32(&port, port);
@@ -355,7 +358,7 @@
setsockopt_one(sockfd, SO_REUSEADDR);
rds_makeaddr(&sockaddr, &socklen, host, port);
if (bind(sockfd, (SA *)&sockaddr, socklen) != SUCCESS0)
- error(SYS, "bind failed");
+ error(SYS, "bind RDS failed");
set_socket_buffer_size(sockfd);
return sockfd;
}
@@ -454,6 +457,7 @@
char *serv, size_t servlen, int flags)
{
int stat = getnameinfo(sa, salen, host, hostlen, serv, servlen, flags);
+
if (stat < 0)
error(0, "getnameinfo failed: %s", gai_strerror(stat));
}
Modified: branches/ofed-1.4.1upgrade/qperf/trunk/src/socket.c
===================================================================
--- branches/ofed-1.4.1upgrade/qperf/trunk/src/socket.c 2009-05-29 14:02:35 UTC (rev 293)
+++ branches/ofed-1.4.1upgrade/qperf/trunk/src/socket.c 2009-05-29 14:04:56 UTC (rev 294)
@@ -1,8 +1,8 @@
/*
* qperf - handle socket tests.
*
- * Copyright (c) 2002-2008 Johann George. All rights reserved.
- * Copyright (c) 2006-2008 QLogic Corporation. All rights reserved.
+ * Copyright (c) 2002-2009 Johann George. All rights reserved.
+ * Copyright (c) 2006-2009 QLogic Corporation. All rights reserved.
*
* This software is available to you under a choice of one of two
* licenses. You may choose to be licensed under the terms of the GNU
@@ -252,7 +252,7 @@
/*
* Measure UDP latency (server side).
*/
- void
+void
run_server_udp_lat(void)
{
datagram_server_lat(K_UDP);
@@ -273,6 +273,7 @@
sync_test();
while (!Finished) {
int n = send_full(sockFD, buf, Req.msg_size);
+
if (Finished)
break;
if (n < 0) {
@@ -304,6 +305,7 @@
buf = qmalloc(Req.msg_size);
while (!Finished) {
int n = recv_full(sockFD, buf, Req.msg_size);
+
if (Finished)
break;
if (n < 0) {
@@ -337,6 +339,7 @@
sync_test();
while (!Finished) {
int n = send_full(sockFD, buf, Req.msg_size);
+
if (Finished)
break;
if (n < 0) {
@@ -378,6 +381,7 @@
buf = qmalloc(Req.msg_size);
while (!Finished) {
int n = recv_full(sockFD, buf, Req.msg_size);
+
if (Finished)
break;
if (n < 0) {
@@ -418,6 +422,7 @@
sync_test();
while (!Finished) {
int n = write(sockFD, buf, Req.msg_size);
+
if (Finished)
break;
if (n < 0) {
@@ -449,6 +454,7 @@
buf = qmalloc(Req.msg_size);
while (!Finished) {
int n = recv(sockFD, buf, Req.msg_size, 0);
+
if (Finished)
break;
if (n < 0) {
@@ -481,6 +487,7 @@
sync_test();
while (!Finished) {
int n = write(sockFD, buf, Req.msg_size);
+
if (Finished)
break;
if (n < 0) {
@@ -521,10 +528,11 @@
sync_test();
buf = qmalloc(Req.msg_size);
while (!Finished) {
- struct sockaddr_storage clientAddr;
+ SS clientAddr;
socklen_t clientLen = sizeof(clientAddr);
int n = recvfrom(sockfd, buf, Req.msg_size, 0,
(SA *)&clientAddr, &clientLen);
+
if (Finished)
break;
if (n < 0) {
@@ -584,6 +592,7 @@
if (!ai->ai_family)
continue;
*fd = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
+ setsockopt_one(*fd, SO_REUSEADDR);
if (connect(*fd, ai->ai_addr, ai->ai_addrlen) == SUCCESS0)
break;
close(*fd);
@@ -637,6 +646,7 @@
debug("accepted %s connection", kind_name(kind));
set_socket_buffer_size(*fd);
close(listenFD);
+ debug("receiving to %s port %d", kind_name(kind), port);
}
@@ -737,7 +747,7 @@
get_socket_port(int fd, uint32_t *port)
{
char p[NI_MAXSERV];
- struct sockaddr_storage sa;
+ SS sa;
socklen_t salen = sizeof(sa);
if (getsockname(fd, (SA *)&sa, &salen) < 0)
@@ -758,8 +768,10 @@
send_full(int fd, void *ptr, int len)
{
int n = len;
+
while (!Finished && n) {
int i = write(fd, ptr, n);
+
if (i < 0)
return i;
ptr += i;
@@ -779,8 +791,10 @@
recv_full(int fd, void *ptr, int len)
{
int n = len;
+
while (!Finished && n) {
int i = read(fd, ptr, n);
+
if (i < 0)
return i;
ptr += i;
Modified: branches/ofed-1.4.1upgrade/qperf/trunk/src/support.c
===================================================================
--- branches/ofed-1.4.1upgrade/qperf/trunk/src/support.c 2009-05-29 14:02:35 UTC (rev 293)
+++ branches/ofed-1.4.1upgrade/qperf/trunk/src/support.c 2009-05-29 14:04:56 UTC (rev 294)
@@ -2,8 +2,8 @@
* qperf - support routines.
* Measure socket and RDMA performance.
*
- * Copyright (c) 2002-2008 Johann George. All rights reserved.
- * Copyright (c) 2006-2008 QLogic Corporation. All rights reserved.
+ * Copyright (c) 2002-2009 Johann George. All rights reserved.
+ * Copyright (c) 2006-2009 QLogic Corporation. All rights reserved.
*
* This software is available to you under a choice of one of two
* licenses. You may choose to be licensed under the terms of the GNU
@@ -143,6 +143,7 @@
{
uint64_t l = 0;
uint8_t *p = (DecodePtr += n);
+
while (n--)
l = (l << 8) | (*--p & 0xFF);
return l;
@@ -178,6 +179,7 @@
qmalloc(long n)
{
void *p = malloc(n);
+
if (!p)
error(0, "malloc failed");
return p;
@@ -235,7 +237,7 @@
{
send_sync(msg);
recv_sync(msg);
- debug("synchronize %s completed", msg);
+ debug("synchronization complete");
}
@@ -246,6 +248,7 @@
send_sync(char *msg)
{
int n = strlen(msg);
+
send_mesg(msg, n, msg);
}
@@ -257,8 +260,8 @@
recv_sync(char *msg)
{
char data[64];
+ int n = strlen(msg);
- int n = strlen(msg);
if (n > sizeof(data))
error(BUG, "buffer in recv_sync() too small");
recv_mesg(data, n, msg);
@@ -380,9 +383,9 @@
getaddrinfo_port(char *node, int port, struct addrinfo *hints)
{
struct addrinfo *res;
-
char *service = qasprintf("%d", port);
int stat = getaddrinfo(node, service, hints, &res);
+
free(service);
if (stat != 0)
error(0, "getaddrinfo failed: %s", gai_strerror(stat));
@@ -400,6 +403,7 @@
setsockopt_one(int fd, int optname)
{
int one = 1;
+
if (setsockopt(fd, SOL_SOCKET, optname, &one, sizeof(one)) >= 0)
return;
error(SYS, "setsockopt %d %d to 1 failed", SOL_SOCKET, optname);
@@ -407,26 +411,60 @@
/*
- * This is called when a SIGURG signal is received. When the other side
- * encounters an error, it sends an out-of-band TCP/IP message to us which
- * causes a SIGURG signal to be received.
+ * This is called when a SIGURG signal is received indicating that TCP
+ * out-of-band data has arrived. This is used by the remote end to indicate
+ * one of two conditions: the test has completed or an error has occurred.
*/
void
-urgent_error(void)
+urgent(void)
{
+ char *p, *q;
char buffer[256];
- char *p = buffer;
- char *q = p + sizeof(buffer);
+ /*
+ * There is a slim chance that an urgent message arrived before accept
+ * returned. This is likely not even possible with the current code flow
+ * but we check just in case.
+ */
+ if (RemoteFD < 0)
+ return;
+
+ /*
+ * This recv could fail if for some reason our socket buffer was full of
+ * in-band data and the remote side could not send the out of band data.
+ * If the recv fails with EWOULDBLOCK, we should keep reading in-band data
+ * until we clear the in-band data. Since we do not send enough data for
+ * this case to cause us concern in the normal case, we do not expect this
+ * to ever occur. If it does, we let the lower levels deal with it.
+ */
+ if (recv(RemoteFD, buffer, 1, MSG_OOB) != 1)
+ return;
+
+ /*
+ * If the indication is that the other side has completed its testing,
+ * indicate completion on our side also.
+ */
+ if (buffer[0] == '.') {
+ set_finished();
+ return;
+ }
+
+ /*
+ * If we are the server, we only print out client error messages if we are
+ * in debug mode.
+ */
if (!Debug && !is_client())
die();
+ p = buffer;
+ q = p + sizeof(buffer);
buf_app(&p, q, remote_name());
buf_app(&p, q, ": ");
timeout_set(ERROR_TIMEOUT, sig_alrm_remote_failure);
for (;;) {
int s = sockatmark(RemoteFD);
+
if (s < 0)
remote_failure_error();
if (s)
@@ -436,6 +474,7 @@
while (p < q) {
int n = read(RemoteFD, p, q-p);
+
if (n <= 0)
break;
p += n;
@@ -517,7 +556,7 @@
return 0;
if (RemoteFD >= 0) {
- send(RemoteFD, "#", 1, MSG_OOB);
+ send(RemoteFD, "?", 1, MSG_OOB);
write(RemoteFD, buffer, p-buffer);
shutdown(RemoteFD, SHUT_WR);
timeout_set(ERROR_TIMEOUT, sig_alrm_die);
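
Note on the out-of-band marker used above: the new urgent() reads one byte with recv(..., MSG_OOB), treats '.' as "remote side finished", and treats anything else as the start of an error report. The sender side of this patch now sends '?' instead of '#' before an error message. The following is a minimal sketch of that one-byte convention over a loopback TCP connection, not qperf's code. The names remote_event, classify_marker and oob_roundtrip are hypothetical, and the sketch waits for urgent data with poll(POLLPRI) rather than a SIGURG handler, which is an assumption made to keep the example short.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <poll.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical names; qperf itself calls set_finished() or reports an error. */
enum remote_event { REMOTE_FINISHED, REMOTE_ERROR };

/*
 * Map the one-byte urgent marker to an event: '.' means the peer finished;
 * any other byte (the patch sends '?') is treated as an error indication.
 */
static enum remote_event
classify_marker(char c)
{
    return c == '.' ? REMOTE_FINISHED : REMOTE_ERROR;
}

/*
 * Send one marker byte as TCP urgent data over loopback and read it back.
 * Returns the byte received, or -1 on any failure.
 */
static int
oob_roundtrip(char marker)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    struct pollfd pfd;
    char c;
    int result = -1;
    int sfd = -1;
    int lfd = socket(AF_INET, SOCK_STREAM, 0);
    int cfd = socket(AF_INET, SOCK_STREAM, 0);

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;                  /* let the kernel choose a port */

    if (lfd < 0 || cfd < 0)
        goto done;
    if (bind(lfd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
        goto done;
    if (listen(lfd, 1) < 0)
        goto done;
    if (getsockname(lfd, (struct sockaddr *)&addr, &len) < 0)
        goto done;
    if (connect(cfd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
        goto done;
    sfd = accept(lfd, NULL, NULL);
    if (sfd < 0)
        goto done;

    /* Sender: a single urgent byte, as in send(RemoteFD, "?", 1, MSG_OOB). */
    if (send(cfd, &marker, 1, MSG_OOB) != 1)
        goto done;

    /* Receiver: wait for the urgent condition, then fetch the byte. */
    pfd.fd = sfd;
    pfd.events = POLLPRI;
    if (poll(&pfd, 1, 2000) != 1)
        goto done;
    if (recv(sfd, &c, 1, MSG_OOB) != 1)
        goto done;
    result = (unsigned char)c;

done:
    if (sfd >= 0)
        close(sfd);
    if (cfd >= 0)
        close(cfd);
    if (lfd >= 0)
        close(lfd);
    return result;
}
```

A caller would pass the byte returned by oob_roundtrip() to classify_marker(). In the patch, the error branch then goes on to read the in-band message text that follows the marker.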