OProfile / Bugs / #264 opjitconv gets "record past end of file" error

summary: thx --> opjitconv gets "record past end of file" error
Description has changed:

Diff:

--- old
+++ new
@@ -1,4 +1,7 @@
-Doing system-wide profiling of a large Java application, a user intermittently encountered problems with opjitconv not processing all JIT dump files. Passing "--verbose=debug" to operf allowed us to collect detailed information on the JIT conversion process.  In the log, we see the following error:
+Doing system-wide profiling of a large Java application, a user intermittently encountered problems with opjitconv not processing all JIT dump files, and operf displays the message
+  JIT dump processing exited abnormally: 1
+
+Passing "--verbose=debug" to operf allowed us to collect detailed information on the JIT conversion process.  In the log, we see the following error:

     record past end of file
     JIT convert error -1

Group: -->

Maynard Johnson - 2014-01-24

We've gotten to the root cause of the problem reported by this user. The JIT dump file was being corrupted by the JVM process writing stdout messages to it. This happened because of the manner in which the user's script for starting the JVM was actually closing stdout and stderr. Thus, when the JIT dump file was opened by the agent, it used file descriptor #1 (which is normally stdout.

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Maynard Johnson - 2014-01-24

In spite of the fact that the actual cause of this particular user's problem was of their own making, there is a lack of synchronization between the JIT agent library and opjitconv as I mentioned in my opening description. I will attach a patch that addresses that issue.

0001-Plug-timing-hole-between-JIT-agent-and-opjitconv-tha.patch

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Maynard Johnson - 2014-01-24

assigned_to: hanseld
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Maynard Johnson - 2014-01-24

Daniel, please review the patch I just posted. Thanks.

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

hanseld - 2014-02-11

I reviewed the patch.

I have seen that flock() can not set the lock for a file that is located in a NFS. In such a scenario the function fcntl() is suggested.
If that scenario is not significant I would suggest to document that limitation in the source code and/or in the docomentation of oprofile.

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Maynard Johnson - 2014-02-11

From what I understand, flock can safely be used by local NFS clients, no matter the kernel version. In a scenario where NFS is being used to back the oprofile_data session directory, we would have two local NFS clients using flock -- opjitconv and the JIT agent library. So I don't think there's a concern here.

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

hanseld - 2014-02-12

I think you are right.
The restriction for NFS located files could be meant in a scenario where one process used flock() to set the lock on one network side and another process want to use flock() on the other network side (i.e. both processes are on different network locations).
In our case of Oprofile both processes are located on the same network side.

So I can give my sign-off to your patch. Thanks.

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Maynard Johnson - 2014-02-12

status: open --> open-fixed
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Maynard Johnson - 2014-02-12

The patch has been pushed upstream. Thanks for reviewing, Daniel!

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Maynard Johnson - 2014-09-11

Status: open-fixed --> closed
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

opjitconv gets "record past end of file" error

Group

Searches

Help

#264 opjitconv gets "record past end of file" error

Discussion