on ARM I'm seeing output like:
cyclicte-623 0....... 19619418us+: tracing_mark_write: hit latency threshold (2000 > 2097)
That's because of a format mismatch in
tracemark("hit latency threshold (%d > %d)", diff, tracelimit);
diff is a u64 and tracelimit an int. So on ARM the string is passed in r0,
tracelimit in r1 and diff in r2+r3. vsnprintf used in tracemark only
expects two ints passed and so only uses r1 and r2 yielding the permutation
in the output.
This patch also adds a gcc attribute to tracemark that helps catch
similar bugs. In this case just adding the attribute but not touching
the call site would result in:
src/cyclictest/cyclictest.c: In function ‘timerthread’:
src/cyclictest/cyclictest.c:899:4: warning: format ‘%d’ expects argument of type ‘int’, but argument 2 has type ‘uint64_t’ [-Wformat]
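For illustration, a minimal sketch of the attribute and of a call site with matching types. The names mark_format() and mark_threshold() are hypothetical stand-ins, not the functions in cyclictest.c, and the sketch formats into a buffer rather than writing a trace marker:

```c
#include <inttypes.h>
#include <stdarg.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for tracemark(). The attribute tells gcc that
 * argument 3 is a printf-style format and that the values to check
 * start at argument 4. */
static int mark_format(char *buf, size_t len, const char *fmt, ...)
	__attribute__((format(printf, 3, 4)));

static int mark_format(char *buf, size_t len, const char *fmt, ...)
{
	va_list ap;
	int n;

	va_start(ap, fmt);
	n = vsnprintf(buf, len, fmt, ap);
	va_end(ap);
	return n;
}

/* Call site with matching types: the 64-bit value is printed with
 * PRIu64, the int with %d, so no argument is misread on 32-bit ABIs. */
static int mark_threshold(char *buf, size_t len, uint64_t diff, int tracelimit)
{
	return mark_format(buf, len,
			   "hit latency threshold (%" PRIu64 " > %d)",
			   diff, tracelimit);
}
```

Casting diff to a type that matches the conversion specifier would be an equally valid way to satisfy the check.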
---
Hello
after some chatting with Clark and John I dropped the c99 stuff and added the
attribute annotation.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Clark Williams <clark.williams@gmail.com>
Hello,
While playing around with hackbench I discovered that I would sometimes
get an enormous time reported, even if the run time would be less than a
second or so. The problem was that the struct timeval start was not
initialized until after all children have been created. But if the
program receives a signal before this is done, the start time is left
uninitialized.
I propose that in such situations an error message be displayed, like
the following patch does.
Please let me know if this is acceptable.
Regards,
/Ciprian
Signed-off-by: Clark Williams <williams@redhat.com>
e.g.
cyclictest -a4,6-8 -t5
will use 5 threads, assigned round-robin to the set of CPUs {4,6,7,8}.
CPU 4 will get threads 1 and 5, CPU 6 gets thread 2, CPU 7 gets thread 3, and
CPU 8 gets thread 4.
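The round-robin assignment can be sketched as follows. cpu_for_thread() is a hypothetical helper, and the CPU set is assumed to have already been parsed into a plain array:

```c
#include <stddef.h>

/* Hypothetical helper: pick the CPU for a zero-based thread index by
 * walking the list of allowed CPUs round-robin.
 * Returns -1 if the list is empty. */
static int cpu_for_thread(const int *cpus, size_t ncpus, size_t thread_idx)
{
	if (cpus == NULL || ncpus == 0)
		return -1;
	return cpus[thread_idx % ncpus];
}
```

With the set {4,6,7,8}, thread indices 0 to 4 (threads 1 to 5 in the text above) map to CPUs 4, 6, 7, 8 and 4 again.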
As explained in the updated manpage, libnuma >= v2 is required for these
arbitrary CPU sets. With libnuma v1, the -a option behaves as before. As
before, compiling without libnuma is supported. The command usage help is fixed
up at compile time to always show the correct usage of the -a option.
Also note that, since numa_parse_cpustring_all() wasn't available in early
libnuma v2 versions, we use numa_parse_cpustring(). This means you'll have to
use taskset in some cases (isolcpus kernel parameter) to add the desired CPUs to
the set of allowed cores, e.g.:
taskset -c4-6 cyclictest -a4-6
Tested without libnuma (numactl), and with versions 1.0.2 and 2.0.9-rc3.
Signed-off-by: Aaron Fabbri <ajfabbri@gmail.com>
(cherry picked from commit 5375ab86e77881d8043e5e309bb8daf5a84cc05f)
Signed-off-by: Clark Williams <clark.williams@gmail.com>
These changes make the align option truly optional as claimed.
1. Rename disaligned to offset for readability.
2. Fix the aligned option so that if no optional argument is given,
the offset defaults to 0
3. Fix some white space problems as reported by checkpatch.pl in the kernel
Signed-off-by: John Kacur <jkacur@redhat.com>
This patch provides an additional -A/--align flag to cyclictest to align
the wakeup times of all threads as closely as possible.
When running multiple threads in cyclictest (-S or -t # option) the threads
are launched in an unsynchronized manner. Basically the creation order and
time for thread creation determines the start time. For provoking a maximum
congestion situation (e.g. cache evictions) and to improve reproducibility
of run conditions the start times should be well-defined distances apart. The
well-defined distance is implemented as an offset parameter to -A/--align
and will offset each thread's start time by the parameter * the sequentially
assigned thread number (par->tnum). Together with -d0 (distance in the
intervals of the individual threads), this alignment option makes it possible
to get the thread wakeup times as closely synchronized as possible.
The method to sync is simply that the thread with par->tnum == 0 is chosen
to set a globally shared timestamp, and all other threads use this timestamp
as their starting time rather than each calling clock_gettime() at startup.
To ensure synchronization of the thread startup the setting of the global
time is guarded by pthread_barriers.
Signed-off-by: Nicholas Mc Guire <der.herr@hofr.at>
Reviewed-by: Andreas Platschek <andreas.platschek@opentech.at>
Signed-off-by: Clark Williams <clark.williams@gmail.com>
Change return value from option parsing to be enumerated type
rather than a character. Hopefully this will clean up the option
handling a bit and not confuse me when I come back to add yet
another option to cyclictest.
Signed-off-by: Clark Williams <clark.williams@gmail.com>
Commit ad27df7 ("Reimplement better child tracking and improve error
handling") changed the way of reporting pid/error after creating a
child. It now returns a union that mixes a pid_t, a pthread_t and a
signed long long for errors.
Now on 32bit x86 both pid_t and pthread_t are four bytes in size and are
stored in the first 4 bytes. If the most significant bit of the long
long variable happens to be set by chance (because nobody really
initializes the variable here) then the error variable will be negative. On
little endian machines the assignment of pid or threadid won't reset the
sign bit and you see this:
| Running in process mode with 10 groups using 40 file descriptors each (== 400 tasks)
| Each sender will pass 100 messages of 100 bytes
| 0 children started. Expected 40
| sending SIGTERM to all child processes
| signaling 0 worker threads to terminate
| Creating workers (error: Success)
A machine with proper endian handling (that is big endian) would reset
the sign bit during the assignment of pid and I would not have to make
this patch :)
While here, I make create_worker() static since it is not used outside of this
file.
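The effect can be reproduced with a small sketch. The union below only mirrors the layout described above (childinfo_t and the helpers are hypothetical names), and memset() stands in for whatever happened to be on the stack. Strictly speaking the C standard leaves the bytes outside the stored member unspecified; the sketch relies on the common compiler behaviour of leaving them untouched:

```c
#include <pthread.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical mirror of the union described above. */
typedef union {
	pid_t pid;
	pthread_t threadid;
	long long error;
} childinfo_t;

static int is_little_endian(void)
{
	const unsigned int one = 1;

	return *(const unsigned char *)&one == 1;
}

/* Fill every byte of the union with `fill` (simulating leftover stack
 * contents), assign only the pid member, then read back `error`. */
static long long error_after_pid(unsigned char fill, pid_t pid)
{
	childinfo_t child;

	memset(&child, fill, sizeof(child));
	child.pid = pid;
	return child.error;
}
```

On a little endian machine, error_after_pid(0xff, 1234) is negative even though the pid is valid, while starting from zeroed bytes gives a non-negative value.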
Cc: David Sommerseth <davids@redhat.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Clark Williams <clark.williams@gmail.com>
Add the --notrace/-A option, intended to be used in conjunction
with the -b option. This will cause cyclictest to exit when a
threshold is hit, but will not perform any tracing operations,
allowing more sophisticated tracing to be done externally.
Signed-off-by: Clark Williams <clark.williams@gmail.com>
This code adds the -F/--fifo option to cyclictest. Using the
--fifo <path> option will cause cyclictest to create a named
fifo at <path> and dump the current run statistics to that
fifo when it is opened and read.
Signed-off-by: Clark Williams <clark.williams@gmail.com>
Huge latencies are observed (close to 1 second) when certain
options are used in cyclictest.
The problem was first introduced in commit da4956cbca
("use interval on first loop instead of 1 second"). It removed
the 1 second first timing loop out of the main path in cyclictest
but left it in two other paths, namely the ones triggered by
these two options:
-r --relative use relative timer instead of absolute
-s --system use sys_nanosleep and sys_setitimer
which in turn causes the huge latencies of close to 1 second to
be reported by cyclictest with certain uses of those two options.
Here we extend the original commit to remove the 1 second
hardcoded timer values from the RELTIME and ITIMER options, by
simply using the actual interval provided instead.
Signed-off-by: Jim Somerville <Jim.Somerville@windriver.com>
Cc: Clark Williams <williams@redhat.com>
Cc: John Kacur <jkacur@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: John Kacur <jkacur@redhat.com>
Don't tag copies of files in BUILD created when building an rpm
Without this change tags finds both copies, eg: for tag cyclictest.c
# pri kind tag file
1 F F cyclictest.c BUILD/rt-tests/src/cyclictest/cyclictest.c
1
2 F F cyclictest.c src/cyclictest/cyclictest.c
With this change, only the latter one is found
Signed-off-by: John Kacur <jkacur@redhat.com>
version 2:
- Add new placeholders in rt-tests.spec-in to make the replacements by
"make rpm" more visible to future maintainers of rt-tests.spec-in
- fix typo of my name in rt-tests.spec-in
rt-tests can be built without NUMA:
make NUMA=0
But "make rpm" does not have a way to be successful without NUMA:
build_rt-tests_0.85> make rpm
for F in cyclictest signaltest pi_stress rt-migrate-test ptsematest sigwaittest svsematest pmqtest sendme pip_stress hackbench *.o .depend *.*~ *.orig *.rej rt-tests.spec *.d *.a ChangeLog; do find -type f -name $F | xargs rm -f; done
rm -f hwlatdetect
rm -f tags
rm -rf BUILD BUILDROOT RPMS SRPMS SPECS releases *.tar.gz rt-tests.spec tmp
git log >ChangeLog
mkdir -p releases
mkdir -p tmp/rt-tests
cp -r Makefile COPYING ChangeLog src tmp/rt-tests
tar -C tmp -czf rt-tests-0.85.tar.gz rt-tests
rm -f ChangeLog
cp rt-tests-0.85.tar.gz releases
sed s/__VERSION__/0.85/ <rt-tests.spec-in >rt-tests.spec
rpmbuild -ba --define "_topdir /a/home/frowand/me/src/rt-tests/build_rt-tests_0.85" --define "_sourcedir /a/home/frowand/me/src/rt-tests/build_rt-tests_0.85/releases" --define "_builddir /a/home/frowand/me/src/rt-tests/build_rt-tests_0.85/BUILD" rt-tests.spec
error: Failed build dependencies:
numactl-devel is needed by rt-tests-0.85-1.fc12.src
make: *** [rpm] Error 1
The following patch allows the rpm to be built without NUMA, with the command:
make NUMA=0 rpm
Signed-off-by: Frank Rowand <frank.rowand@am.sony.com>
Signed-off-by: John Kacur <jkacur@redhat.com>
make rpm creates the dirs BUILDROOT and SPECS that are missed by distclean.
Gather all rpm-related dirs into RPMDIRS and add that to distclean.
Signed-off-by: John Kacur <jkacur@redhat.com>
Reviewed-by: Frank Rowand <frank.rowand@am.sony.com>
Tested-by: Frank Rowand <frank.rowand@am.sony.com>
The files in the tmp dir are generated during make release.
These are the kind of generated files that should be removed by distclean,
so add tmp. make release can then be slightly simplified by depending
on distclean instead of clean.
Signed-off-by: John Kacur <jkacur@redhat.com>
Reviewed-by: Frank Rowand <frank.rowand@am.sony.com>
Tested-by: Frank Rowand <frank.rowand@am.sony.com>
Don't tag the copies of the source files placed in the tmp directory
during the creation of a release.
Without this change tags finds both copies, eg: for tag cyclictest.c
# pri kind tag file
1 F C F cyclictest.c src/cyclictest/cyclictest.c
1
2 F F cyclictest.c tmp/rt-tests/src/cyclictest/cyclictest.c
1
With this change only the first one is found.
Signed-off-by: John Kacur <jkacur@redhat.com>
Reviewed-by: Frank Rowand <frank.rowand@am.sony.com>
Tested-by: Frank Rowand <frank.rowand@am.sony.com>
Clean up cyclictest formatting:
Change leading spaces to tabs.
Align function parameters.
Place type of function on same line as function name.
Signed-off-by: Frank Rowand <frank.rowand@am.sony.com>
Signed-off-by: John Kacur <jkacur@redhat.com>
V3: Use src/lib/error.c functions instead of fprintf.
Fix printf format warnings for 32 bit vs 64 bit systems with cast.
One issue with using warn() and info() instead of fprintf is that
the compiler no longer warns about format mismatches.
Fix bad continuation line white space prefix.
Remove unused variable zero_diff.
cyclictest: ARM panda clock resolution will be ~30 usec unless
CONFIG_OMAP_32K_TIMER=n, resulting in a poor latency report.
This patch does _not_ fix the problem, it merely provides the
instrumentation to make it visible. The value of measured
resolution is useful information for any system.
Signed-off-by: Frank Rowand <frank.rowand@am.sony.com>
- Fixed up minor white space problem.
Signed-off-by: John Kacur <jkacur@redhat.com>
V3: unchanged from V2
cyclictest getopt_long() parameter clean up.
Clean up before following patch which will add a new option.
Some elements of long_options were not in alphabetical order.
Some elements of optstring were not in alphabetical order.
'-e', '--latency' was missing help text
short form of --duration ('D') was missing from optstring
Change a few instances of leading spaces to tabs.
Add white space to long_options to improve readability.
Some cases of the switch processing the result of
getopt_long() were not in alphabetical order.
Did _not_ clean up option value parsing and processing.
Signed-off-by: Frank Rowand <frank.rowand@am.sony.com>
Signed-off-by: John Kacur <jkacur@redhat.com>
Conflicts:
src/cyclictest/cyclictest.c
For compilation to work,
-D_GNU_SOURCE -Isrc/include
must be passed to the compiler. Debian packaging adds several
things but not these two. So be a bit more friendly
and add them unconditionally. There is no harm if they are also included in
the user-supplied CFLAGS and so passed twice.
Moreover be a bit more correct about CFLAGS/CPPFLAGS. Both should be
passed to the compiler with CFLAGS taking options for the compiler and
CPPFLAGS taking options for the preprocessor. This is also needed for
Debian packaging where the helper scripts set CPPFLAGS.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: John Kacur <jkacur@redhat.com>
Discovered while compiling with "hardening flags"
For Debian 7.0 (aka wheezy) packages it's recommended to use several
hardening flags, the default on amd64 being:
CFLAGS=-g -O2 -fstack-protector --param=ssp-buffer-size=4 -Wformat -Werror=format-security
CPPFLAGS=-D_FORTIFY_SOURCE=2
LDFLAGS=-Wl,-z,relro
This patch doesn't fix all warnings but at least makes all programs compile
again by not using char *variables as printf format strings.
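The pattern behind this kind of fix, as a minimal sketch (format_message() is a hypothetical helper): a string that is only known at run time is passed as an argument to a constant "%s" format rather than as the format itself.

```c
#include <stdio.h>

/* Treat `msg` as data, never as a format string. */
static int format_message(char *buf, size_t len, const char *msg)
{
	/* The variant rejected by -Werror=format-security would be:
	 *	snprintf(buf, len, msg);
	 * where any '%' in msg is interpreted as a conversion. */
	return snprintf(buf, len, "%s", msg);
}
```

With the "%s" form, a message such as "100%d%n" is copied verbatim instead of being interpreted.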
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: John Kacur <jkacur@redhat.com>
Minor fix to make working with git nicer.
Now that we're building a lib, we need to exclude it from git status output.
Do the same for patches we generate or apply.
Signed-off-by: John Kacur <jkacur@redhat.com>
Add back call to the tracemark function but only if we're
using the breaktrace option and only when we actually hit
the breaktrace threshold.
Signed-off-by: Clark Williams <williams@redhat.com>
Currently if a non-root user requests a priority higher than the soft limit in
/etc/security/limits.conf
the call to sched_setscheduler will silently fail and the user will be running
with a priority of 0. Cyclictest will not complain, and will display the
requested priority, resulting in seemingly poor results.
The following patch fixes this by doing two things.
1. If the requested priority is higher than the soft limit but lower than the
hard limit, it will raise the soft limit to the requested priority.
2. If the requested priority is higher than the hard limit, it will fail with a
warning.
The patch should not affect privileged users.
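The two cases can be sketched with getrlimit()/setrlimit() on RLIMIT_RTPRIO. The function and enum names are hypothetical, a priority equal to the hard limit is treated as allowed (an assumption), and the sketch makes no special provision for privileged users:

```c
#include <sys/resource.h>

enum rtprio_action {
	RTPRIO_OK,		/* request is within the soft limit */
	RTPRIO_RAISE_SOFT,	/* above the soft limit, within the hard limit */
	RTPRIO_REFUSE		/* above the hard limit */
};

/* Pure decision helper so the policy can be checked without privileges. */
static enum rtprio_action classify_rtprio(rlim_t soft, rlim_t hard, int prio)
{
	rlim_t want;

	if (prio <= 0)
		return RTPRIO_OK;
	want = (rlim_t)prio;
	if (want <= soft)
		return RTPRIO_OK;
	if (want <= hard)
		return RTPRIO_RAISE_SOFT;
	return RTPRIO_REFUSE;
}

/* Returns 0 if `prio` may be requested (raising the soft limit when
 * needed), -1 if it exceeds the hard limit or a system call fails. */
static int ensure_rtprio_allowed(int prio)
{
	struct rlimit rlim;

	if (getrlimit(RLIMIT_RTPRIO, &rlim) != 0)
		return -1;

	switch (classify_rtprio(rlim.rlim_cur, rlim.rlim_max, prio)) {
	case RTPRIO_OK:
		return 0;
	case RTPRIO_RAISE_SOFT:
		rlim.rlim_cur = (rlim_t)prio;
		return setrlimit(RLIMIT_RTPRIO, &rlim) == 0 ? 0 : -1;
	default:
		return -1;	/* the caller is expected to warn here */
	}
}
```

A caller would run ensure_rtprio_allowed() before sched_setscheduler() and print the warning when it returns -1.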
Reported-by: Nicholas Mc Guire <der.herr@hofr.at>
Signed-off-by: John Kacur <jkacur@redhat.com>
Signed-off-by: Clark Williams <williams@redhat.com>
V2: use type casting instead of ugly constant in format string
Fix printf format string to fix compile warning for ARM 32 bit target.
Signed-off-by: Frank Rowand <frank.rowand@am.sony.com>
Signed-off-by: Clark Williams <williams@redhat.com>
When the --verbose option is selected, the first value for each thread is
incorrectly reported as zero.
This is because when collecting the first value, the index into stat->values is
incremented from zero to one before storing the value. But when printing the
values, the first value printed is stat->values[0], which has been initialized
to zero.
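A sketch of the store-then-increment order that avoids the problem. The structure below is a simplified, hypothetical stand-in for the statistics buffer in cyclictest:

```c
#include <stddef.h>

#define VALBUF_SIZE 8	/* hypothetical size, kept small for illustration */

struct stat_buf {
	long values[VALBUF_SIZE];
	size_t next;	/* index where the next sample will be stored */
};

/* Store first, then advance the index, so the first sample lands in
 * values[0] instead of leaving it at its initial zero. */
static void record_value(struct stat_buf *s, long v)
{
	s->values[s->next % VALBUF_SIZE] = v;
	s->next++;
}
```

Incrementing before the store would put the first sample in values[1], which is the symptom described above.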
Signed-off-by: Frank Rowand <frank.rowand@am.sony.com>
Signed-off-by: Clark Williams <williams@redhat.com>
Fix the machinetype check for cross-compiling.
This has been tested on an x86_64 Fedora host for an x86_64 target and
an ARM target. Additional testing would be greatly appreciated.
Signed-off-by: Frank Rowand <frank.rowand@am.sony.com>
Tested-by: Darren Hart <dvhart@linux.intel.com>
Signed-off-by: Clark Williams <williams@redhat.com>
The '-a' option is always ignored if --smp or --numa is specified. Fix the
warning message to not depend on --smp or --numa occurring first.
Signed-off-by: Frank Rowand <frank.rowand@am.sony.com>
Tested-by: Darren Hart <dvhart@linux.intel.com>
Signed-off-by: Clark Williams <williams@redhat.com>
Avoid annoying warning message when tracing is not requested and the debug
file system is not available.
The same test already protects against calling event_enable_all().
Signed-off-by: Frank Rowand <frank.rowand@am.sony.com>
Reviewed-by: Darren Hart <dvhart@linux.intel.com>
Signed-off-by: Clark Williams <williams@redhat.com>
This fixes a segfault on ARM when the '-a' option is used.
man sched_setaffinity says to use pthread_setaffinity_np() when using the
POSIX threads API.
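For reference, a minimal sketch of pinning the calling thread with pthread_setaffinity_np(). The helper names are hypothetical, and _GNU_SOURCE has to be defined before any header is included for the call and the CPU_* macros to be declared:

```c
#define _GNU_SOURCE
#include <errno.h>
#include <pthread.h>
#include <sched.h>

/* Pin the calling thread to a single CPU.
 * Returns 0 on success or an errno-style error number. */
static int pin_self_to_cpu(int cpu)
{
	cpu_set_t mask;

	if (cpu < 0 || cpu >= CPU_SETSIZE)
		return EINVAL;
	CPU_ZERO(&mask);
	CPU_SET(cpu, &mask);
	return pthread_setaffinity_np(pthread_self(), sizeof(mask), &mask);
}

/* Lowest CPU the calling thread is currently allowed to run on,
 * or -1 if it cannot be determined. */
static int first_allowed_cpu(void)
{
	cpu_set_t mask;
	int cpu;

	CPU_ZERO(&mask);
	if (pthread_getaffinity_np(pthread_self(), sizeof(mask), &mask) != 0)
		return -1;
	for (cpu = 0; cpu < CPU_SETSIZE; cpu++)
		if (CPU_ISSET(cpu, &mask))
			return cpu;
	return -1;
}
```

Unlike sched_setaffinity(), which takes a kernel thread ID, pthread_setaffinity_np() takes a pthread_t, so it fits code that already works with pthread handles.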
Signed-off-by: Frank Rowand <frank.rowand@am.sony.com>
Tested-by: Darren Hart <dvhart@linux.intel.com>
Signed-off-by: Clark Williams <williams@redhat.com>
Start of an ongoing process to establish an error strategy in which return
values are checked and, on error, the program exits with an appropriate status.
Signed-off-by: Clark Williams <williams@redhat.com>