Discussion:
PerlIO_write Question
Reinhard Pagitsch
2005-10-13 12:08:11 UTC
Permalink
Hello,

I tried the following in my XS module:

int Write1Byte(PerlIO* fh, long what)
{
return PerlIO_write(fh, what, 1);
// or return PerlIO_write(fh, (void*)what, 1);
}
With this I got a crash.

If I do this, it works:
int Write1Byte(PerlIO* fh, long what)
{
int i;
char buf[1] = { '\0' };
buf[0] = (char)what;
i = PerlIO_write(fh, buf, 1);
return i;
}

Can anyone tell me why?

Thank you,
Reinhard
Muppet
2005-10-13 15:04:13 UTC
Permalink
Post by Reinhard Pagitsch
Hello,
int Write1Byte(PerlIO* fh, long what)
{
return PerlIO_write(fh, what, 1);
// or return PerlIO_write(fh, (void*)what, 1);
}
With this I got a crash.
int Write1Byte(PerlIO* fh, long what)
{
int i;
char buf[1] = { '\0' };
buf[0] = (char)what;
i = PerlIO_write(fh, buf, 1);
return i;
}
Can anyone tell me why?
Because (void*)what is not the same as &what.

I assume you're wanting to write value-of-long-what-cast-as-char to the stream.

(void*)what tells PerlIO_write() to find the data to write at the address
value-of-long-cast-as-memory-address, which isn't what you want at all.

The way you did it in your second function, copying the data into a stack
buffer, is more work, but it is safer across byte orders than taking the
address of the function argument. That is, you don't have to worry about
machine byte order deciding whether *(char*)&what picks up the top byte or
the bottom byte of the long.
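The buffer-copy idea can be sketched like this, using stdio's fwrite() as a
stand-in for PerlIO_write() (same buffer-plus-count calling convention; the
function name write1byte is just for illustration):

```c
#include <stdio.h>

/* Copy the low byte of the long into a local buffer before writing.
 * Masking with 0xff takes the least-significant byte explicitly, so
 * the result does not depend on where that byte sits in memory. */
int write1byte(FILE *fh, long what)
{
    char buf[1];
    buf[0] = (char)(what & 0xff);      /* low byte, endian-independent */
    return (int)fwrite(buf, 1, 1, fh); /* returns 1 on success */
}
```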
--
muppet <scott at asofyet dot org>
Tassilo von Parseval
2005-10-13 15:12:11 UTC
Permalink
Post by Reinhard Pagitsch
int Write1Byte(PerlIO* fh, long what)
{
return PerlIO_write(fh, what, 1);
// or return PerlIO_write(fh, (void*)what, 1);
}
With this I got a crash.
No wonder. It's horribly wrong. PerlIO_write receives a memory address
as second argument. When you pass it 'what', which is a long, this is
interpreted as a memory address. Is that really what you want? I assume
you want:

return PerlIO_write(fh, (void*)&what, 1);

But this is then byte-order dependent: it will write the
least-significant byte on little-endian machines and the
most-significant byte on big-endian ones.
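The difference can be seen by comparing the first byte in memory with the
arithmetic low byte (a small sketch; both helper names are made up here):

```c
/* The first byte in memory of a long depends on the machine's byte
 * order, but (what & 0xff) is always the least-significant byte. */
unsigned char first_byte_in_memory(long what)
{
    return *(unsigned char *)&what;      /* endian-dependent */
}

unsigned char low_byte(long what)
{
    return (unsigned char)(what & 0xff); /* endian-independent */
}
```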
Post by Reinhard Pagitsch
int Write1Byte(PerlIO* fh, long what)
{
int i;
char buf[1] = { '\0' };
buf[0] = (char)what;
i = PerlIO_write(fh, buf, 1);
return i;
}
Can anyone tell me why?
I don't understand the purpose of Write1Byte(). Why is the second
argument a 'long'? Can you explain the semantics of this function first?
Is it supposed to write the least significant byte of an integer value?

Tassilo
--
use bigint;
$n=71423350343770280161397026330337371139054411854220053437565440;
$m=-8,;;$_=$n&(0xff)<<$m,,$_>>=$m,,print+chr,,while(($m+=8)<=200);
Reinhard Pagitsch
2005-10-13 15:56:00 UTC
Permalink
Post by Tassilo von Parseval
Post by Reinhard Pagitsch
int Write1Byte(PerlIO* fh, long what)
{
return PerlIO_write(fh, what, 1);
// or return PerlIO_write(fh, (void*)what, 1);
}
With this I got a crash.
No wonder. It's horribly wrong. PerlIO_write receives a memory address
as second argument. When you pass it 'what', which is a long, this is
interpreted as a memory address. Is that really what you want? I assume
return PerlIO_write(fh, (void*)&what, 1);
But this is then byte-order dependent: it will write the
least-significant byte on little-endian machines and the
most-significant byte on big-endian ones.
Oh, thanks, my mistake.
Post by Tassilo von Parseval
Post by Reinhard Pagitsch
int Write1Byte(PerlIO* fh, long what)
{
int i;
char buf[1] = { '\0' };
buf[0] = (char)what;
i = PerlIO_write(fh, buf, 1);
return i;
}
Can anyone tell me why?
I don't understand the purpose of Write1Byte(). Why is the second
argument a 'long'? Can you explain the semantics of this function first?
Is it supposed to write the least significant byte of an integer value?
The function will be used in a module I am writing to parse AFPDS files
and, let's say, merge them.

This function is to write e.g. the control character (0x5A) to the file.
It will also be used to write CRLF and the triplet (e.g. 0xD3EEEE: NOP
record) to the file if needed. I changed the type of the function
parameter to short, which should be better, I think.

An AFPDS record contains (not always all) 6 parts:
+ the control character (1 byte)
+ the length of the record (2 bytes)
+ the triplet (3 bytes)
+ a flag (1 byte)
+ two bytes with 0x0000
+ the data (the rest)
However, IBM's manual says the control character is not part of the record.

So my intention was to use a set of general functions to which I can
pass the information needed for writing to the file. Maybe there is a
better way?
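One possible shape for such helpers, sketched with stdio's fwrite() in place
of PerlIO_write() (the names write_u8 and write_u16_be are hypothetical):
encode the 2-byte length most-significant byte first, as IBM formats use on
the wire, so the output does not depend on host byte order.

```c
#include <stdio.h>

/* Write a single byte (e.g. the 0x5A control character). */
int write_u8(FILE *fh, unsigned int v)
{
    unsigned char b = (unsigned char)(v & 0xff);
    return (int)fwrite(&b, 1, 1, fh);
}

/* Write a 16-bit value big-endian (e.g. the record length). */
int write_u16_be(FILE *fh, unsigned int v)
{
    unsigned char b[2];
    b[0] = (unsigned char)((v >> 8) & 0xff); /* most significant first */
    b[1] = (unsigned char)(v & 0xff);        /* least significant */
    return (int)fwrite(b, 1, 2, fh);
}
```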

regards
Reinhard
Tassilo von Parseval
2005-10-14 07:39:56 UTC
Permalink
Post by Tassilo von Parseval
But this is then byte-order dependent: it will write the
least-significant byte on little-endian machines and the
most-significant byte on big-endian ones.
Do you have more information about big and little endian? Maybe some
links? But not too theoretical, more practical.
I can't remember where I learned about it. But it's not so extremely
difficult that it couldn't be explained easily.

Consider a decimal number always consisting of 4 digits, such as 1234.
The maths behind that is

1234 = 4*10^0 + 3*10^1 + 2*10^2 + 1*10^3

This is big-endian "digit"-order because the big (that is, significant)
digit comes first. Little-endian would be 4321, because the little
(least significant) digit comes first.

With computers the digits are actually bytes, and the math becomes:

01 02 03 04 = 1*256^3 + 2*256^2 + 3*256^1 + 4*256^0 = 16909060

Again, the above is big-endian. In little-endian, the bytes are simply
reversed:

04 03 02 01 = 4*256^0 + 3*256^1 + 2*256^2 + 1*256^3 = 16909060

In your example you had a 'long', which we assume is 4 bytes (but it
could be 8 bytes as well).

long what = 1;

Internal representation as char-buffer is for big-endian:

unsigned char bytes[4] = { 0, 0, 0, 1 };

and little-endian:

unsigned char bytes[4] = { 1, 0, 0, 0 };

Therefore, if you do a

PerlIO_write(fh, (char*)&what, 1);

0 is written out on big-endian and 1 on little-endian, because each
writes the first byte of the long's in-memory representation.

There are macros to swap the byteorder:

#define swap32(n) \
    n = ((n & 0xff000000) >> 24) | \
        ((n & 0x00ff0000) >>  8) | \
        ((n & 0x0000ff00) <<  8) | \
        ((n & 0x000000ff) << 24)

/* for 16-bit integers (shorts) */
#define swap16(n) \
    n = ((n & 0xff00) >> 8) | \
        ((n & 0x00ff) << 8)

It should be obvious how the above macros work: a bitmask is used to extract a
certain byte and then it's shifted to the appropriate position.
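The same masks and shifts can be written as functions that return the
swapped value instead of assigning in place (a sketch, not part of any
library):

```c
#include <stdint.h>

/* Reverse the byte order of a 32-bit value: each mask isolates one
 * byte, and the shift moves it to the mirrored position. */
uint32_t swap32(uint32_t n)
{
    return ((n & 0xff000000u) >> 24) |
           ((n & 0x00ff0000u) >>  8) |
           ((n & 0x0000ff00u) <<  8) |
           ((n & 0x000000ffu) << 24);
}

/* Same idea for 16-bit values: exchange the two bytes. */
uint16_t swap16(uint16_t n)
{
    return (uint16_t)(((n & 0xff00u) >> 8) | ((n & 0x00ffu) << 8));
}
```

Note that swapping twice gives back the original value, which is a handy
sanity check.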

You can avoid endian-issues by de- and en-coding numbers manually:

/* unsigned char b[4] contains the bytes; it must be unsigned so that
   no sign extension happens when the bytes are combined */
int num = b[3] | (b[2]<<8) | (b[1]<<16) | (b[0]<<24);

and correspondingly for the other direction:

b[0] = (num & 0xff000000) >> 24; /* most significant */
b[1] = (num & 0x00ff0000) >> 16;
b[2] = (num & 0x0000ff00) >> 8;
b[3] = num & 0x000000ff; /* least significant */
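Both directions together make a round trip, sketched here as a pair of
helper functions (the names encode_be32/decode_be32 are made up):

```c
/* Encode a 32-bit value big-endian into a 4-byte buffer. */
void encode_be32(unsigned char b[4], unsigned int num)
{
    b[0] = (num >> 24) & 0xff;  /* most significant */
    b[1] = (num >> 16) & 0xff;
    b[2] = (num >>  8) & 0xff;
    b[3] = num & 0xff;          /* least significant */
}

/* Decode 4 big-endian bytes back into a 32-bit value; the casts
 * keep the shifts in unsigned arithmetic. */
unsigned int decode_be32(const unsigned char b[4])
{
    return ((unsigned int)b[0] << 24) | ((unsigned int)b[1] << 16) |
           ((unsigned int)b[2] <<  8) |  (unsigned int)b[3];
}
```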

Also, big-endian is often referred to as network byte order, while
whatever order the local machine uses is called host byte order. The
libc contains some conversion functions as well ('h' standing for host,
'n' for network in the below functions):

#include <netinet/in.h>

uint32_t htonl(uint32_t hostlong);
uint16_t htons(uint16_t hostshort);
uint32_t ntohl(uint32_t netlong);
uint16_t ntohs(uint16_t netshort);
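For example, the 2-byte record length could go through htons() so the
bytes written are the same on every host (a sketch; length_to_wire is a
hypothetical name):

```c
#include <arpa/inet.h>  /* htons/ntohs; <netinet/in.h> also declares them */
#include <string.h>

/* Convert a 16-bit length to network (big-endian) order and copy
 * its bytes out; out[0] is the high byte on every host. */
void length_to_wire(unsigned short len, unsigned char out[2])
{
    unsigned short be = htons(len);
    memcpy(out, &be, 2);
}
```

On a big-endian host htons() is a no-op, which is why these functions are
convenient: the caller never has to ask which kind of machine it is on.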

Tassilo
--
use bigint;
$n=71423350343770280161397026330337371139054411854220053437565440;
$m=-8,;;$_=$n&(0xff)<<$m,,$_>>=$m,,print+chr,,while(($m+=8)<=200);
Muppet
2005-10-14 12:23:01 UTC
Permalink
Post by Tassilo von Parseval
Also, big-endian is often referred to as network byte order, and the
machine's native order as host byte order.
You'll also hear them referred to as Motorola and Intel byte orders.
Motorola's processors tend to be big-endian, and Intel's little-endian.

The original debate was over whether to make math cheaper to
implement or numbers easier to follow when dealing with hexdumps.


http://en.wikipedia.org/wiki/Endianness


--
To me, "hajime" means "the man standing opposite you is about to hit
you with a stick".
-- Ian Malpass, speaking of the Japanese word for "the beginning"
Reinhard Pagitsch
2005-10-14 06:56:15 UTC
Permalink
Post by Tassilo von Parseval
But this is then byte-order dependent: it will write the
least-significant byte on little-endian machines and the
most-significant byte on big-endian ones.
Do you have more information about big and little endian? Maybe some
links? But not too theoretical, more practical.

Thank you,
Reinhard